Abstract. From a nanoscience perspective, cellular processes and their reduced in vitro imitations
provide extraordinary examples of highly robust few- or single-molecule reaction pathways.
Prime examples are biochemical reactions involving DNA molecules, and the coupling of these
reactions to the physical conformations of DNA. In this review, we summarise recent results
on the following phenomena: We investigate the biophysical properties of DNA-looping and the
equilibrium configurations of DNA-knots, whose relevance to biological processes is increasingly
appreciated. We discuss how random DNA-looping may be related to the efficiency of the target
search process of proteins for their specific binding site on the DNA molecule. Finally, we dwell on
the spontaneous formation of intermittent DNA nanobubbles and their importance for biological
processes, such as transcription initiation. The physical properties of DNA may indeed turn out
to be particularly suitable for the use of DNA in nanosensing applications.
I. INTRODUCTION

Deoxyribonucleic acid (DNA) is the molecule of life
as we know it.^1 It contains all information of an entire
organism.^2 This information is copied during cell division
with an extremely high fidelity by the replication
mechanism. Despite the rather high chemical and physical
stability of DNA, due to constant action of enzymes
and other binding proteins (mismatches, rupture) as well
as potential environmentally induced damage (radiation,
chemicals), this low error rate, i.e., the suppression of the
liability to mutations, is only possible with the constant
action of repair mechanisms. Although DNA's
structural and mechanical properties are rather well
established for isolated DNA molecules (starting with
Rosalind Franklin's X-ray diffraction images), the
characterisation of DNA in its cellular environment, and even in
vitro during interaction with binding proteins, is the subject
of ongoing investigations.

Recent advances in experimental techniques such as 
fluorescence methods, atomic force microscopy, or opti- 
cal tweezers have leveraged the potential to both probe 
and manipulate the equilibrium and out of equilibrium 
behaviour of single DNA molecules, making it possible 
to explore DNA's physical and mechanical properties as 
well as its interaction with other biopolymers, such as the 
DNA-protein interplay during gene regulation or repair 
processes. An important ingredient is the coupling to 
thermal activation due to the highly Brownian environ- 
ment. Although mostly performed in vitro, these exper- 
iments provide access to increasingly refined information 
on the nature of DNA and its environment-controlled be- 
haviour. 

In addition to chromosomal packaging inside the nu- 
cleus of eukaryotic cells and the concentration of DNA 
in the membraneless nucleoid region of prokaryotes, the 
global structure of the DNA molecule can be affected 
by topological entanglements. Thus, by error or design 
a DNA molecule can attain a knotted or concatenated 
state, reducing or inhibiting biologically relevant func- 
tions, for instance, replication or transcription. Such en- 
tangled states can be actively reduced by enzymes of the 
topoisomerase family. Their precise action, in particular, 
how they determine the presence of an entangled state, is 
not fully known. Current studies therefore aim at shed- 
ding light on possible mechanisms, in particular, in view 
of the importance of topoisomerase action (or better, its 
inhibition) in tumour proliferation. Other applications 
may be directed towards the treatment of viral diseases
by modifying the packaging of viral DNA to create knots 



^1 Our DNA world during biotic and prebiotic evolution was
supposedly preceded by an RNA world and, quite likely, by sugarless
nucleic acids.

^2 A small fraction of genetic information is stored on DNA that
is kept at other regions of the cell and not replicated on cell
division, such as mitochondrial or ribosomal DNA.



in the virus capsid and prevent ejection of the DNA into 
a host, and thereby infection. DNA knots are also be- 
ing recognised as a potential complication in the use of 
nanochannels for DNA separation and sequencing. In 
such confined geometries DNA knots are created with 
appreciable probability, affecting the reliability of these 
techniques. Similarly to DNA knots, DNA looping is 
intimately connected to the function of DNA. Current 
results on DNA looping and DNA knot behaviour are 
summarised in the first parts of this review. 

The Watson-Crick double-helix represents the thermo- 
dynamically stable state of DNA at moderate salt con- 
centrations and below the melting temperature. This 
stability is effected by Watson-Crick hydrogen bonding 
and the stronger base stacking of neighbouring base-pairs 
(bps). However, even at room temperature DNA locally 
opens up intermittent flexible single-stranded domains,
so-called DNA-bubbles. Their size typically ranges from 
a few broken bps, increasing to some 200 broken bps 
closer to the melting temperature. The thermal melt- 
ing of DNA has traditionally been used to obtain the 
sequence-dependent stability parameters of DNA. More 
recently, the role of intermittent bubble domains has 
been investigated with respect to the liability of DNA- 
denaturation induced by proteins that selectively bind to 
single-stranded DNA. It has been speculated that, due to
the liability to denaturation of the TATA motif, bubble
formation may aid in transcription initiation. The
dynamics of single bubbles can be monitored by fluorescence
methods, opening a window both to study the breathing
of DNA experimentally and to obtain high-precision
DNA stability data. Finally, bubble dynamics has been
suggested as a useful tool in optical nanosensing. DNA 
breathing is the topic of the second part of this work. 

Essentially all the biological functions of DNA rely on 
site-specific DNA-binding proteins locating their targets 
(cognate sites) on the DNA molecule, and therefore re- 
quire searching through megabases of non-target DNA in 
a highly efficient manner. For instance, gene regulation 
is performed by specific regulatory proteins. On binding 
to a promoter area on the DNA, they recruit or inhibit 
binding of RNA polymerase and subsequent transcrip- 
tion of the associated gene. The search for the cognate 
site is in fact facilitated by the DNA molecule: in addi- 
tion to three-dimensional search it enables the proteins 
to also move one-dimensionally along the DNA while be- 
ing non-specifically bound. Moreover, at points where 
the DNA loops back on itself, this polymeric conforma- 
tion provides shortcuts for the proteins in the chemical 
coordinate along the DNA, approximately giving rise to 
search-efficient Lévy flights. Target search is currently a
very active field of research, and single molecule meth- 
ods have been shown to provide essential new informa- 
tion. Moreover, the architecture of more complex pro- 
moters relying on the simultaneous presence of several 
regulatory proteins is being investigated to create in sil- 
ico circuits for highly sensitive chemical probes in small 
volumes. Such nanosensing applications are expected to 



be of great importance in microarrays or other nano- and 
microapplications. The third part of this review deals 
with diffusional aspects of gene regulation. 

At the same time DNA's role in classical polymer
physics is increasingly appreciated. With the possibility
to reproduce DNA with extremely low error rate by the
PCR^3, monodisperse samples can be prepared. While
shorter single-stranded DNA can be used as a model
for flexible polymers, the double strand exhibits a
semiflexible behaviour with a persistence length that can be
easily probed experimentally. Moreover, DNA is orders
of magnitude longer than conventional polymers.
Combined with the potential of single molecule probing, DNA
is advancing as a model polymer.

After an introduction to the properties of DNA we ad- 
dress these functional properties of DNA from the per- 
spective of biological relevance, physical behaviour and 
nanotechnological potential. Most emphasis will be put 
on the single molecular aspects of DNA. We note that 
this is not intended to be an exhaustive review on the 
physical properties of DNA. Rather, we present some im- 
portant features and their consequences from a personal 
perspective. 



II. PHYSICAL PROPERTIES AND BIOLOGICAL 
FUNCTION OF DNA 

Biomolecules that occur naturally in biological systems
can be grouped into unspecific oligo- and macromolecules
and biopolymers in the stricter sense.
Unspecific biomolecules are produced by biological organisms
in a large range of molecular weight and structure,
such as polysaccharides (cellulose, chitin, starch, etc.),
higher fatty acids, actin filaments or microtubules. Also
the natural 'india-rubber' from the Hevea brasiliensis
tree, historically important for both industrial purposes
and the development of polymer physics, belongs to
this group.

Biopolymers in the stricter sense we are going to
assume here comprise the polynucleotides DNA and RNA,
consisting of the four-letter nucleotide alphabet with
A-T and G-C (A-U and G-C for RNA) bps, and the
polypeptidic proteins consisting of 20 different amino
acids, each coded for by 3 bases (codons) in the RNA.
We will come back to proteins later when
reviewing binding protein-DNA interactions. Biopolymers
are copied and/or created according to the
information flow sketched in figure 1, the so-called central
dogma of molecular biology, a term originally coined by
Francis Crick. Accordingly, starting from the genetic



^3 Polymerase Chain Reaction: thermal denaturation of a DNA
molecule into two single strands and subsequent cooling in a
solution of single nucleotides and invariable primers produces
two new complete double-stranded DNA molecules. Cycling of
this process produces large, monodisperse quantities of DNA.
FIG. 1 Central dogma of molecular biology after F. Crick:
Potentially, information flow is completely symmetric between
the three levels of cellular biopolymers (DNA, RNA,
proteins). However, the recognised pathways are only those
represented here, where solid lines represent probable transfers,
and dotted lines represent (in principle) possible transfers.
FIG. 2 Ladder structure of the DNA formed by its four
building-blocks A, G, C, and T, giving rise to the typical 
double-helical structure of DNA. A-T bps establish 2 H- 
bonds, G-C bps 3 H-bonds. 



code stored in the DNA (in some cases in RNA), DNA
is copied by DNA polymerase (replication), and the
proteins as the actual task-performing biopolymers are
created via messenger RNA (created by DNA transcription
through RNA polymerase) and further by translation in
ribosomes to proteins.^4

^4 Alternatively, the genetic code can be transcribed into transfer
and ribosomal RNA that is not translated into proteins.

DNA is made up of the four bases A(denine), G(uanine),
C(ytosine), and T(hymine)
that form the DNA ladder structure shown in figure 2.
These building-blocks A, G, C, T pair according to the
key-lock principle as A-T and G-C, where the AT bond
is weaker than the GC bond in terms of stability. Apart
from the Watson-Crick base-pairing energy, the stability
of dsDNA is effected by the stacking interactions, the
specific matching of subsequent bps along the
double-strand, i.e., bp-bp interactions. In standard literature,
the stacking interactions are listed for pairs of bps (e.g.,
for AT-GC, AT-AT, AT-TA, etc.), see below.^5
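
The pairing rules just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the hydrogen-bond counts (2 for A-T, 3 for G-C) are those quoted in the caption of figure 2. Stacking interactions, which also contribute to the stability, are deliberately left out.

```python
# Watson-Crick partners and hydrogen bonds per base pair (values from figure 2).
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complementary_strand(strand):
    """Return the partner strand, read in the opposite (antiparallel) direction."""
    return "".join(PARTNER[base] for base in reversed(strand))


def hydrogen_bond_count(strand):
    """Total number of Watson-Crick hydrogen bonds in the duplex formed by `strand`."""
    return sum(H_BONDS[base] for base in strand)


if __name__ == "__main__":
    for strand in ("TATAAA", "GCGCGC"):  # an AT-rich and a GC-rich hexamer
        print(strand, complementary_strand(strand), hydrogen_bond_count(strand))
```

Comparing an AT-rich with a GC-rich stretch of equal length shows the lower number of hydrogen bonds in the former, which is one ingredient of the weaker stability of AT-rich regions.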

Based on this AGCT alphabet, the primary structure
of DNA can be specified. DNA's six local structural
elements twist, tilt, roll, shift, slide, and rise are effected by
the stacking interactions between vicinal bps. In figure 3
we show a map with the structure elements of the entire
E. coli genome, demonstrating the degree of structural
information currently available. These structural elements
define the local geometrical structure of DNA within a
typical correlation (persistence) length^6 of about 150 bps
corresponding to 50 nm (the bp-bp distance measures 3.4 Å,
reflecting the rather complex chemical structure of a
nucleotide in comparison to the monomer size of
man-made polymers such as polyethylene). On
a larger scale, much longer than the persistence length,
DNA becomes flexible. On this level, tertiary structural
elements come into play. One example is DNA looping,
that is, the formation of polymeric lasso loops induced
by chemical bonds between binding proteins attached to
the DNA at specific bps which are remote along the DNA
backbone. An extreme limit of
tertiary structure is the packaging of DNA onto histones
and further wrapping into the chromosomes of
eukaryotic cells. At the same time, dsDNA may
locally open into floppy ssDNA bubbles, with a
persistence length of a few bases.^7 These fluctuation-induced
bubbles increase their statistical weight at higher
temperatures, until the dsDNA fully denatures (melts). We
will come back to DNA denaturation bubbles below.
Depending on the external conditions, DNA occurs in
several configurations. Under physiological conditions, one
is concerned with B-DNA, but there are other states such
as A, B', Z, ps, triplex DNA, quadruplex DNA, cruciform,
and H. DNA occurs
naturally in a large range of length scales. In viruses,
DNA is of the order of a few μm long. In bacteria, it
already reaches lengths of several mm, and in mammalian


^5 Longer ranging bp-bp interactions are most likely small in
comparison.

^6 The persistence length of a polymer chain defines the
characteristic length scale above which the polymer is susceptible to
bending induced by thermal fluctuations, i.e., it is the length
scale above which the tangent-tangent correlation decays along
the chain, see the Appendix.

^7 In fact, it has been questioned whether there is a meaningful
value of the persistence length of ssDNA at all, due to its
significant apparent sequence dependence (23).



cells it can reach the order of a few m, roughly 2 m in a
human cell and 35 m in a cell of the South American
lungfish, albeit split up into the individual chromosomes.
DNA in bacteria in vivo, or extracted from bacteria and
higher cells, can for our purposes therefore be viewed as a
fully flexible polymer with a persistence length of roughly
50 nm, being governed by generic effects independent of
the detailed sequence. On short scales DNA becomes
semiflexible and governed by the worm-like chain model
(Kratky-Porod model); on even shorter scales,
local structural elements become important (in particular,
for recognition by binding proteins (11)), and eventually
molecular resolution is reached.
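
The crossover between the semiflexible and the fully flexible regime can be made concrete with the standard worm-like chain result for the mean squared end-to-end distance, <R^2> = 2PL[1 - (P/L)(1 - exp(-L/P))]. The following is a minimal sketch, assuming the persistence length P = 50 nm quoted above; the function name and the chosen contour lengths are illustrative only.

```python
import math


def wlc_mean_square_end_to_end(contour_nm, persistence_nm=50.0):
    """<R^2> of a worm-like chain: 2*P*L*(1 - (P/L)*(1 - exp(-L/P)))."""
    ratio = persistence_nm / contour_nm
    return 2.0 * persistence_nm * contour_nm * (
        1.0 - ratio * (1.0 - math.exp(-contour_nm / persistence_nm))
    )


if __name__ == "__main__":
    # From far below to far above the persistence length.
    for contour_nm in (5.0, 50.0, 500.0, 50000.0):
        rms = math.sqrt(wlc_mean_square_end_to_end(contour_nm))
        print(f"L = {contour_nm:8.0f} nm   rms end-to-end = {rms:8.1f} nm   rms/L = {rms / contour_nm:.3f}")
```

For contour lengths well below P the chain behaves as a rigid rod (rms distance close to L), whereas for L much larger than P the flexible-chain scaling <R^2> = 2PL is recovered.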

Stacking interactions govern the local structure of
dsDNA. Globally, an additional constraint arises due to
the circular nature of the DNA, since it has to satisfy the
conservation law (25; 26; 27)

Lk = Tw + Wr, (1) 

where Lk stands for the linking number, Tw for the twist, 
and Wr for the writhe of the double helix. The linking 
number Lk is an integer and formally given by one-half 
the number of signed crossings of one DNA strand with 
the other in any regular projection of the molecule. Lk 
is a topological property, and no deformation of a closed 
DNA, without breaking and rejoining the DNA strands, 
will alter it. Tw is equal to the number of times that 
the two strands of DNA wind about the central axis of 
the molecule, and Wr is a number whose absolute value 
equals approximately the number of times that the DNA 
axis winds about itself.^8 Whereas Tw is a property of
the double-helical structure of DNA, Wr is a property
of the DNA axis alone. Tw and Wr do not need to be
integers and are not conserved, but coupled through Lk
by equation (1). A nicked circular DNA, i.e., when the
twist can fully relax, carries Lk_0 = N/h links, where N is
the number of bp and h (h ≈ 10.5 in B-DNA) the number
of bps per turn.

The degree of supercoiling of DNA can be expressed in
terms of the linking number difference, ΔLk = Lk − Lk_0.
The DNA of virtually all terrestrial organisms is
underwound or negatively supercoiled, i.e., ΔLk < 0 (figures 4
and 5).^9 Often, the superhelical density σ = ΔLk/Lk_0
is used; most supercoiled DNA molecules isolated from
either prokaryotes or eukaryotes have σ values between
−0.05 and −0.07 (29). Negative supercoiling is regulated
in prokaryotes by DNA gyrase; eukaryotes lack gyrase
but maintain negative supercoiling through winding of
DNA around nucleosomes and interactions with
DNA-unwinding proteins. There are two forms of
intracellular supercoiling, the plectonemic form, characteristic of


^8 For details about the calculation of Tw and Wr for representative
models of DNA, see (28).

^9 An exception are thermophilic organisms living near undersea
geothermal vents that have positively supercoiled DNA in order
to stabilise the double helix at extreme temperatures.
FIG. 3 Structure atlas of the E. coli genome. Figure courtesy David Ussery, Technical University of Denmark. The structure
atlas is available under the URL www.cbs.dtu.dk/services/GenomeAtlas/. 




FIG. 4 Right-handed (negative), normal, and left-handed 
(positive) superhelix. The DNA of virtually all terrestrial 
organisms is negatively supercoiled. 

plasmid DNA and accessible, nucleosome-free regions of 
chromatin, and the toroidal or solenoidal form, where su- 
percoiling is attained by DNA wrapped around histone 
octamers or prokaryotic non-histone DNA-binding proteins
(figure 6). The former is the active form of supercoiled
DNA and is freely accessible to proteins involved in
transcription, replication, recombination and
DNA repair. The latter is the stored form of supercoiled 



DNA and is largely responsible for the extraordinary
degree of compaction required to condense typical genomes
into the cell's nucleus.^10 Negative supercoiling facilitates
the local unwinding of DNA by providing a ubiquitous
source of free energy that augments the unwinding free
energy accompanying the interactions of many proteins
with their cognate DNA sequences. The local unwinding
of DNA, in turn, is an integral part of many
biological processes such as gene regulation and DNA
replication (see section IV). Therefore, understanding the
interplay of supercoiling and local helical structure is
essential to the understanding of biological mechanisms.
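
The bookkeeping of linking number, twist and writhe can be illustrated with a short numerical sketch. It assumes the helical repeat h = 10.5 bp per turn quoted above and treats the linking number as a continuous quantity for simplicity (in a closed circle Lk is an integer); all function names are hypothetical.

```python
HELICAL_REPEAT_BP = 10.5  # bp per turn in B-DNA (value quoted in the text)


def relaxed_linking_number(n_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Lk_0 = N / h for a relaxed (nicked) circle of N bp."""
    return n_bp / helical_repeat


def linking_difference(sigma, n_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Linking number difference from the superhelical density: sigma * Lk_0."""
    return sigma * relaxed_linking_number(n_bp, helical_repeat)


def writhe(linking_number, twist):
    """Wr = Lk - Tw, from the conservation law Lk = Tw + Wr."""
    return linking_number - twist


if __name__ == "__main__":
    n_bp = 6996  # plasmid size of figure 5
    print(f"Lk_0 = {relaxed_linking_number(n_bp):.2f}")
    for sigma in (-0.027, -0.05, -0.07):
        print(f"sigma = {sigma:+.3f}  ->  Delta Lk = {linking_difference(sigma, n_bp):.2f}")
```

For a plasmid of a few thousand bps, typical superhelical densities thus correspond to a deficit of a few tens of links, which is shared between twist and writhe according to equation (1).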

Ribonucleic acid (RNA) consists of the same
building blocks as DNA, with the exception that T(hymine)
is replaced by U(racil). RNA typically occurs in
single-stranded form. Therefore, its secondary structure



^10 The nucleus of a human cell has a radius of circa 5 μm and stores
the 2 m of the human genome (30).
FIG. 5 Electron micrographs of nicked (left) and supercoiled
(right) 6996-bp plasmid DNAs. The supercoiled example is
from a population of DNA molecules with an average
superhelix density, σ = −0.027, close to the value expected in vivo.



from the glnALG operon. The size of DNA loops
formed in these systems varies between approximately
100 and 600 bps. In eukaryotes, a variety of
transcription factors bind to enhancers that are hundreds to
several thousand bps away from their promoters and
interact with RNA polymerases directly or through
mediators in order to achieve combinatorial gene regulation
(45). DNA looping is required to juxtapose two
recombination sites in intramolecular site-specific recombination
and is also employed by a number of
restriction endonucleases such as SfiI and NgoMIV, which
recognise and cut two copies of well-separated cognate
sites simultaneously (49; 50; 51). Here we describe a
recent statistical-mechanical theory of loop formation that
connects global mechanical and geometric properties of
both DNA and protein and demonstrates the importance
of protein flexibility in loop-mediated protein-DNA
interactions (53; 55).





FIG. 6 Toroidal (left) and plectonemic (right) forms of su- 
percoiled DNA. 



is richer, being characterised by sequences of hairpins: 
Smaller regions in which chemically remote sequences 
of bases match, pair and form hairpins which are stiff 
and energy-dominated, similar to dsDNA. The remain- 
ing regions form entropy-dominated floppy loops, analo- 
gous to the ssDNA bubbles. Additional tertiary struc- 
ture in RNA comes about by the formation of so-called 
pseudoknots, chemical bonds established between bases 
sitting on chemically distant segments of the secondary 
structure. In RNA modelling the incorporation of
pseudoknots is a non-trivial problem, which currently
receives considerable interest; see, for instance, references
([Ill;[35,;^;^). 



III. DNA-LOOPING 

The formation of DNA loops mediated by proteins 
bound at distant sites along a single molecule is an es- 
sential mechanistic aspect of many biological processes 
including gene regulation, DNA replication, and recom- 
bination (for reviews, see ((ssl : [soh). In E. coli, DNA 
looping represses gene expression at the ara, gal, lac, and 
deo operons and activates transcription



A. Biological significance of DNA looping 

The biological importance of DNA loop formation is
underscored by the abundance of architectural proteins
in the cell such as HU, IHF, and HMG, which
facilitate looping by bending the intervening DNA between
protein-recognition sites. Moreover, DNA looping
has been shown to be subject to regulation through the
binding of effector molecules that alter protein
conformation or protein-DNA interactions.

Two characteristics of DNA looping have been
demonstrated by in vitro and in vivo experiments. One is
cooperative binding of a protein to its two cognate sites,
which can be demonstrated by footprinting methods.
DNA looping can increase the occupancies of both
binding sites; in particular, it can significantly enhance
protein association to the lower-affinity site because of the
tethering effect of DNA looping. This is a general
mechanism by which many transcription factors recruit RNA
polymerases in gene regulation. Another hallmark is
the helical dependence of loop formation (40), which
arises because of DNA's limited torsional flexibility and
the requirement for correct torsional alignment of the two
protein-binding sites. Although many methods have been
developed to directly observe DNA looping in vitro, such
as scanning-probe and electron microscopy and
single-molecule techniques, assays based on helical
dependence have been the only way to identify DNA
looping in vivo. In these experiments, the DNA length
between two protein binding sites is varied and the yield
of DNA loop formation is monitored, for example by the
repression or activation of a reporter gene (58). Using
this helical-twist assay, DNA looping in the ara operon
was first discovered (40).
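
The origin of the helical dependence can be sketched numerically: two sites separated by a non-integer number of helical turns face different sides of the double helix, so the loop has to absorb the remaining twist. The snippet below is a minimal illustration, assuming a helical repeat of 10.5 bp per turn; the function name and the range of spacings are hypothetical and not taken from any specific experiment.

```python
import math

HELICAL_REPEAT_BP = 10.5  # bp per helical turn of B-DNA (value quoted in the text)


def torsional_mismatch_deg(spacing_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Twist in degrees, within [-180, 180), needed to bring two sites into phase."""
    turns = spacing_bp / helical_repeat
    fractional = turns - math.floor(turns + 0.5)  # distance to the nearest whole turn
    return 360.0 * fractional


if __name__ == "__main__":
    # The mismatch repeats with the helical period as the spacing is varied.
    for spacing_bp in range(100, 122, 3):
        print(f"{spacing_bp:4d} bp   mismatch = {torsional_mismatch_deg(spacing_bp):7.1f} deg")
```

Plotting the mismatch against the spacing gives a sawtooth with the period of the helical repeat, which is the periodicity that a helical-twist assay looks for in the looping yield.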

Our knowledge about the roles of DNA bending, twist,
and their respective energetics in DNA looping has come
largely from analyses of DNA cyclisation (60).
Circularisation efficiencies of DNA fragments, which are
quantitatively described by J-factors, oscillate with DNA
length and therefore torsional phase (61; 63). The
J-factor is defined as the ratio of the partition function
of a circularised polymer chain to that of an open chain.
Since there is a dimension reduction due to circularisation
constraints (two polymer ends have to meet), the ratio
has a unit of concentration, or 1/L^3 with L
representing length; see (53) for details. In the present context,
the J-factor is equal to the free DNA-end concentration
whose bimolecular ligation efficiency equals that of the
two ends of a cyclising DNA molecule (63). For short
DNA fragments J-factors are limited by the significant
bending and twisting energies required to form closed
circles, whereas for long DNA, the chain entropy loss during
circularisation exceeds the elastic-energy decrease and
reduces the J-factor. Because of this competition between
bending and twisting energetics and entropy, there is an
optimal DNA length for cyclisation (53). Analogous
behaviour has been expected for DNA looping, especially
with respect to the helical dependence discussed above.
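
The competition between elastic energy and chain entropy can be made tangible with two textbook limits. The sketch below is not the J-factor of the theories discussed here; it merely evaluates (i) the bending energy of a planar circle of a homogeneous elastic rod, 2π²P/L in units of kT, and (ii) the closure probability density of an ideal Gaussian chain with Kuhn length 2P, (3/(4πPL))^{3/2}. The persistence length of 150 bp is the value quoted in the text; the function names are hypothetical and twisting is ignored.

```python
import math

P_BP = 150.0  # persistence length in bp (value quoted in the text)


def circle_bending_energy_kT(length_bp, persistence_bp=P_BP):
    """Bending energy (in kT) of a planar circle of contour length L: 2*pi^2*P/L."""
    return 2.0 * math.pi ** 2 * persistence_bp / length_bp


def gaussian_closure_density(length_bp, persistence_bp=P_BP):
    """Probability density (bp^-3) of zero end-to-end distance for an ideal
    Gaussian chain with <R^2> = 2*P*L; only meaningful for L much larger than P."""
    return (3.0 / (4.0 * math.pi * persistence_bp * length_bp)) ** 1.5


if __name__ == "__main__":
    for length_bp in (100, 200, 500, 1000, 5000):
        print(
            f"L = {length_bp:5d} bp   E_bend = {circle_bending_energy_kT(length_bp):6.2f} kT"
            f"   Gaussian closure density = {gaussian_closure_density(length_bp):.3e} bp^-3"
        )
```

The bending penalty falls off as 1/L while the entropic closure density decreases as L^(-3/2), so short fragments are limited by elasticity and long fragments by entropy, consistent with the existence of an optimal length.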

Quantitative analyses of DNA looping and cyclisation
are challenging problems in statistical mechanics and
have been largely limited to Monte Carlo or Brownian
dynamics simulations (65; 66; 67). Analytical
solutions are available only for some ideal and special
cases. An important contribution in this area is the
theory of Shimada and Yamakawa (69), which is based on a
homogeneous and continuous elastic rod model of DNA.
This theory has been applied extensively to DNA
cyclisation and also DNA looping. The
Shimada-Yamakawa theory makes use of a perturbation
approach, in which small configurational fluctuations of
a DNA chain around the most probable configuration are
accounted for in the evaluation of the partition function.

The elastic-equilibrium conformation is obvious for the
homogeneous DNA circle studied by Shimada and
Yamakawa (69). However, the search for the elastic-energy
minimum of homogeneous DNA molecules with complex
geometry, such as in DNA looping, supercoiling, and
the case of inhomogeneous DNA sequences containing
curvature and nonuniform DNA flexibility, is not
trivial (72; 73). Recently, a statistical-mechanical theory
for sequence-dependent DNA circles has been developed
(53) and applied to the problem of DNA cyclisation (53)
and DNA looping (55). In this model, the DNA
configuration is described by parameters defined at dinucleotide
steps, i.e., tilt, roll, and twist, which allows
straightforward incorporation of intrinsic or protein-induced DNA
curvature at the bp level. Following Shimada and
Yamakawa's method, the theory first determines the
mechanical equilibrium configuration in small DNA circles
(i.e., less than ~ 1000 bp) under certain constraints;
fluctuations around the equilibrium configuration are then
taken into account using a harmonic approximation.
The new method is much more computationally efficient
than Monte Carlo simulation, has comparable accuracy,
and has been applied successfully to analyse
experimental results from DNA cyclisation (53).
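
The two-step strategy of first locating the mechanical equilibrium and then treating the fluctuations harmonically can be illustrated on a one-dimensional toy problem. The following is a sketch only: the energy function, its parameters and all function names are hypothetical and stand in for the far higher-dimensional DNA model. It compares the harmonic estimate of a partition function, exp(-E_min) sqrt(2π/E''), with a direct numerical integration.

```python
import math


def energy(x, stiffness=50.0, quartic=1.0):
    """Toy energy in units of kT: harmonic well plus a weak anharmonic term."""
    return 0.5 * stiffness * x * x + quartic * x ** 4


def partition_function_numeric(stiffness=50.0, quartic=1.0, half_width=5.0, n_points=20001):
    """Trapezoidal estimate of the integral of exp(-E(x)) over [-half_width, half_width]."""
    step = 2.0 * half_width / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        x = -half_width + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * math.exp(-energy(x, stiffness, quartic))
    return total * step


def partition_function_harmonic(stiffness=50.0):
    """Harmonic estimate around the minimum at x = 0, where E_min = 0 and E'' = stiffness."""
    return math.sqrt(2.0 * math.pi / stiffness)


if __name__ == "__main__":
    exact = partition_function_numeric()
    approx = partition_function_harmonic()
    print(f"numerical Z = {exact:.6f}   harmonic Z = {approx:.6f}   ratio = {exact / approx:.4f}")
```

As long as the thermally accessible fluctuations stay within the region where the energy is close to quadratic, the harmonic estimate is accurate at a fraction of the cost of sampling, which is the reason such an approach can compete with Monte Carlo simulation for short, stiff chains.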



The basis of the extension of the model to DNA looping
(55) is to treat the protein subunits as connected rigid
bodies and to allow for a limited number of degrees of
freedom between the subunits. Motions of the subunits
are assumed to be governed by harmonic potentials and
an associated set of force constants, neglecting the
anharmonic terms often required for proteins undergoing
large conformational fluctuations among their modular
domains. Indeed, the use of a harmonic approximation
is supported by the success of continuum elastic
models that are based only on shape and mass-distribution
information in descriptions of protein motion.
Similar to the description used for individual DNA bps in
the model, protein geometry and dynamics are described
by three rigid-body rotation angles (tilt, roll, and twist).
Therefore, DNA looping can be viewed as a
generalisation of DNA cyclisation in which the protein component
is characterised by a particular set of local geometric
constraints and elastic constants. This treatment not only
unifies the theoretical descriptions of DNA cyclisation
and looping, but also allows consideration of flexibilities
at protein-DNA and protein-protein interfaces and
application of the concepts of linking number and writhe. In
previous work, proteins were considered rigid and their
effects on DNA configuration were represented by a set
of constraints applied to DNA ends (38; 76). With
the present approach, programs developed for analysing
DNA cyclisation can be used to analyse DNA looping
with only minor modifications.



The new method (53; 55) is most applicable to the
problem of short DNA loops, in which the free energy
of a wormlike chain is dominated by bending and
torsional elasticity (53; 55). Possible modes of DNA self
contact and contacts between protein and DNA at positions
other than the binding sites are not considered. For large
loops contributions to the free energy from chain entropy
and DNA-DNA contacts can become highly significant.
Several alternative treatments of DNA looping have
appeared recently. One of these addresses the
excluded-volume contribution to DNA looping within large
open-circular molecules (20), whereas two others consider the
effect on looping of traction at the ends of a DNA chain
(77; 78). None of these treatments includes helical
phasing effects on DNA looping. In contrast, a method based
on the Kirchhoff elastic-rod model, which includes the
helical-phase dependence, has been presented (79).
However, this approach does not include thermal
fluctuations per se and therefore is not directly applicable to
calculations of the J-factor. The comprehensive
treatment of small DNA loops described in (53; 55) is thus far
unique to the extent that it accounts for sequence- and
protein-dependent conformational and flexibility
parameters, thermal fluctuations, and helical phasing effects.
B. DNA loop model

The protein subunits that mediate loop formation are 
modelled as two identical and connected rigid bodies, as 
shown in figure 7 (55). In addition to those for
dinucleotide steps, three further sets of rigid-body rotation
angles are defined: two sets for the interfaces
between protein and the last (DP) and first bps (PD) of 
the DNA and one set for the interface between the two 
protein domains (PP), where the symbols in parentheses 
are used to indicate the corresponding angles through 
subscripts. The local Cartesian-coordinate frames for 
protein subunits are defined such that their origins co- 
incide with vertices of a circular chain and their z-axes 
point toward the next vertex in succession. Thus protein 
dimensions can be modelled in terms of a non-canonical 
value for the helix rise corresponding to particular seg- 
ments within a circular polymer chain. 

Angles are expressed in degrees, and lengths in units of 
the DNA helical rise, $l_{bp} = 3.4$ Å. All calculations used 
canonical mechanical parameters for duplex DNA: helical 
twist $\tau_0 = 34.45^\circ$, a sequence-independent twist-angle 
standard deviation, or twisting flexibility, $\sigma_\tau = 4.388^\circ$, 
and standard deviations, or bending flexibilities, for all 
tilt and roll angles, $\sigma_\theta$ and $\sigma_\phi$, respectively, of 4.678° 
(equivalent to a persistence length of 150 bp). Except for 
specific cases where intrinsic DNA bending is considered, 
the average values of tilt and roll are taken to be zero. 
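
The quoted equivalence between a bending flexibility of 4.678° and a persistence length of 150 bp can be checked with a few lines of Python. This is a minimal sketch and not part of the original model description. It assumes the standard small-angle relation for a discrete worm-like chain with independent tilt and roll fluctuations, $P = 2 l_{bp} / (\sigma_\theta^2 + \sigma_\phi^2)$ with the standard deviations in radians. The function name `persistence_length_bp` and the conversion to nanometres are illustrative choices.

```python
import math

RISE_NM = 0.34  # helical rise per base-pair step (3.4 Angstrom), as quoted in the text


def persistence_length_bp(sigma_tilt_deg, sigma_roll_deg):
    """Persistence length in base-pair steps from tilt/roll standard deviations.

    Assumes small, independent Gaussian bending fluctuations per step
    (an assumption of this sketch, not a formula quoted from the text).
    """
    s_tilt = math.radians(sigma_tilt_deg)
    s_roll = math.radians(sigma_roll_deg)
    return 2.0 / (s_tilt ** 2 + s_roll ** 2)


P_BP = persistence_length_bp(4.678, 4.678)  # value in base-pair steps
P_NM = P_BP * RISE_NM                        # same length in nanometres
```

With equal tilt and roll flexibilities the relation reduces to $P = l_{bp}/\sigma^2$, which is why a single number suffices to fix the persistence length.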



C. Simplified protein geometries and flexibility parameters 

For DNA loops with either zero or nonzero end-to-end 
distances, constraints are directly applied to the DNA 
ends, as in the case of DNA cyclisation. We modelled 
DNA loops formed during site synapsis using protein-dependent 
parameters roll $\phi_{DP} = \phi_{PD} = 90^\circ$ and 
twist $\tau_{DP} = \tau_{PD} = 34.45^\circ$. The angle $\tau_{PP}$ was considered 
an adjustable parameter that we denote the axial 
angle and, unless specified, all other protein-related angular 
parameters were set equal to 0°. In these cases the 
DNA ends (the centres of two protein-binding sites on 
DNA) are separated by twice the protein-arm length $l_P$ 
and displaced from one another along the +x direction, 
or toward the major groove of DNA. Projected along the 
x-axis, the axial angle is the included angle between the 
tangents to the DNA at the two protein binding sites and 
is altered by varying the twist between protein subunits 
(figure 7b, c). An axial angle equal to 0° corresponds to 
antiparallel axes at the ends as shown in figure 7a. The 
case of a rigid protein assembly is modelled by setting 
the standard deviations of the DP, PP, and PD sets of 
rigid-body rotation angles to 1 • 10^^ deg. 




FIG. 7 Rigid-body models for studies of protein-mediated 
DNA looping. (a) A prototype 137-bp DNA loop generated 
by interactions with a pair of rigid, DNA-binding protein subunits 
is shown. DNA bps are represented by rectangular slabs 
(red) with axes (blue) that indicate the orientation of the local 
Cartesian coordinate frame whose origin lies at the centre 
of each bp. Two sets of coordinate axes (green) represent 
the local coordinate frames embedded in the protein subunits 
(gold ellipsoids) that mediate DNA looping. The coupling 
of protein and DNA geometry is characterised by tilt, roll, 
and twist values for the DNA-protein, protein-protein, and 
protein-DNA interfaces. Three of these variables are shown 
here: the DNA-protein roll angle, $\phi_{DP}$; the protein-protein 
twist angle, $\tau_{PP}$; and the protein-DNA roll angle, $\phi_{PD}$. (b) 
Prototype 179-bp loop with protein-protein twist angle, $\tau_{PP}$, 
equal to −60 degrees. The view is from the base of the loop 
toward the DNA apex. (c) Loop conformation shown in (b) 
viewed from the side, perpendicular to the loop dyad axis. 



D. DNA loops having zero end-to-end distance and 
antiparallel helical axes 

DNA loops containing N bps in which the two ends 
meet in an antiparallel orientation can be empirically described 
by the following formula: 

$$
\begin{aligned}
\text{Tilt:}\quad & \theta_i = -A_i \cos(180^\circ + \delta) \\
\text{Roll:}\quad & \phi_i = A_i \sin(180^\circ + \delta) \\
\text{Twist:}\quad & \tau_i = \tau
\end{aligned}
\qquad (2)
$$







where $\tau$ is the intrinsic DNA twist and $\delta$ an arbitrary angle 
related to the unconstrained torsional degree of freedom 
of DNA. The coefficients $A_i$ are given by 

[Equations (3)-(5), which define $A_i$ through a piecewise function on the intervals $0 < x < 0.5$ and $0.5 < x < 1$, are not recoverable from the source text.] 



The coefficients in equation (5) were obtained by fitting 
the space curve corresponding to the DNA helical 
axis that gives the minimum elastic energy conformation 
of DNA loops of different sizes and are as follows: 
$a_0 = -335.0142$, $a_1 = 2318.881$, $a_2 = -1299.164$, 
$a_3 = -4483.366$, $a_4 = 38169.74$, $a_5 = -54753.5$. The 
error for end-to-end distances computed using equation 
(2) is less than 2% of DNA length from 50 bp to 100 bp, 
and less than 0.5% from 100 bp to 500 bp. The torsional 
phase angle between two ends is $\xi = -(N - 2)\tau - 2\delta$. 
The entire loop lies in a plane, and the angle between 
the normal vector of the plane and the x-axis of the external 
coordinate frame can be shown to be $\psi = 180^\circ + \tau - \delta$. 
The expressions for $\xi$ and $\psi$ suggest that $\delta$ is related to 
DNA bending isotropy. Loop configurations with different 
$\delta$ values are related to each other by globally twisting 
DNA molecules. Since the orientation of the first bp is 
fixed, this global twist is equivalent to rotation of the 
loop plane, which corresponds to the rotational symmetry 
met in DNA cyclisation of homogeneous DNA with 
bending isotropy (52). Therefore, J-factors for configurations 
with different $\delta$ values are identical. 

If DNA looping needs to be torsionally in-phase, only 
two degenerate loop configurations are available, breaking 
the rotational symmetry. These loop geometries can 
be expressed by equation (2) with two different $\delta$ values: 
$\delta_1 = -(N - 2)\tau/2$ and $\delta_2 = 180^\circ - (N - 2)\tau/2$, which 
satisfy the torsional phase requirement $\xi = 360^\circ \cdot n$, $n = 0, \pm 1, \pm 2, \ldots$ 
In contrast to DNA cyclisation, no twist 
change is involved in forming these ideal DNA loops for 
any DNA length and thus the helical dependence vanishes 
in this case. From the expression given above for $\psi$ 
it is clear that the helical axes of the two loops are coincident 
and their directions are reversed. Figure 8 shows 
the bending profile of the loop configuration corresponding 
to $\delta_1$ for a 150 bp DNA. Surprisingly, the maximal 

FIG. 8 Conformation of an antiparallel, 150-bp DNA loop 
with zero end-to-end distance. (a) Computed space-filling 
model of the loop generated with 3DNA (js^). The ends of 
the DNA juxtapose exactly with antiparallel helical axes and 
exact torsional phasing. (b) Equilibrium roll and bending magnitude 
of the loop shown in (a). The bending magnitude of each 
dinucleotide step is defined as $\sqrt{\theta_i^2 + \phi_i^2}$, where $\theta_i$ and $\phi_i$ are 
the tilt and roll of the i-th dinucleotide step, respectively. 



J-factor occurs at approximately the same DNA length, 
about 460 bp (data not shown), as in DNA cyclisation (52). 
This can be partly explained by the fact that the total 
bending magnitude of the loop is 290 degrees, close to a 
full circle, instead of 180 degrees. 
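
The statement that exactly two in-phase configurations exist, with reversed loop-plane normals, can be verified numerically. The sketch below assumes the torsional phase angle has the form "minus (N − 2) times the twist, minus twice the phase angle" and that the loop-plane angle is "180° plus the twist minus the phase angle", as read from the text above. The symbol names, the function names, and the tolerance are choices made for this illustration only.

```python
TAU = 34.45  # intrinsic twist per dinucleotide step in degrees (value from the text)


def in_phase_deltas(n_bp, tau=TAU):
    """Return the two phase angles (degrees) of the torsionally in-phase loops."""
    half = (n_bp - 2) * tau / 2.0
    return -half, 180.0 - half


def torsional_phase(n_bp, delta, tau=TAU):
    """Torsional phase angle between the two loop ends, in degrees."""
    return -(n_bp - 2) * tau - 2.0 * delta


def plane_normal_angle(delta, tau=TAU):
    """Angle between the loop-plane normal and the external x-axis, in degrees."""
    return 180.0 + tau - delta


def is_multiple_of_360(angle, tol=1e-6):
    """True if the angle is an integer number of full turns (within tol turns)."""
    turns = angle / 360.0
    return abs(turns - round(turns)) < tol


d1, d2 = in_phase_deltas(150)              # the 150-bp loop of figure 8
xi1 = torsional_phase(150, d1)             # expected to be a multiple of 360
xi2 = torsional_phase(150, d2)             # expected to be a multiple of 360
normal_flip = plane_normal_angle(d1) - plane_normal_angle(d2)  # expected 180
```

The two phase angles differ by 180°, so the plane normals differ by 180° as well, which is the numerical counterpart of the two loops sharing a helical axis with reversed direction.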



E. DNA looping with finite end-to-end distance, 
antiparallel helical axes, and in-phase torsional constraint 

Separation of the DNA ends breaks the rotational symmetry, 
restoring the dependence on helical twist. Figure 
9a shows the J-factor as a function of DNA length for 
end-to-end distances of 10 bp and 30 bp. The helical 
dependence increases with end-to-end separation. Starting 
from the two loop configurations (corresponding to $\delta_1$ 
and $\delta_2$) with zero end-to-end distance and in-phase tor- 

FIG. 9 The DNA-length-dependent J-factor and loop configuration 
as a function of end-to-end separation (the J-factor is 
defined in section III.A). (a) The helical dependence of DNA 
looping is shown for values of the end-to-end separation equal 
to 10 bp and 30 bp. The two configurations for the 10-bp separation 
are obtained from corresponding configurations with 
zero end-to-end separation by using an iterative algorithm. 
Therefore the two configurations are designated by the initial 
configurations with phase angles $\delta = -(N - 2)\tau/2 + 0^\circ$ (0°, 
dashed line) and $\delta = -(N - 2)\tau/2 + 180^\circ$ (180°, solid line) as 
described in the text. (b) and (c) show stereo models of the 
two equilibrium configurations for 210-bp (b) and 215-bp (c) 
antiparallel DNA loops with end-to-end separation equal to 
10 bp. The 210- and 215-bp DNA correspond to an adjacent 
peak and valley of the curve in (a), respectively. Conformations 
shown in blue correspond to $\delta = 0$; those shown in red 
are for $\delta = 180^\circ$. Note that for N-bp DNA, the chain contour 
length is equal to $(N - 1) l_{bp}$. 



sional alignment as initial configurations, two mechanical 
equilibrium configurations are obtained by using the 
iterative algorithm described in ((s^. The J-factor in 
figure 9a is the sum of separate J-factors calculated for 
the two configurations. Note that in all cases involving 
configurations that differ in linking number, equilibration 
between the two forms requires breakage of at least one 
of the protein-DNA interfaces. The contributions from 
each of these configurations are shown in detail for the 
case where the ends are separated by 10 bp. Interestingly, 
the length dependences of J computed from the individual 
configurations are out of phase and have a periodicity of 
2 helical turns, which results from the half-twist dependence 
of the phase angles $\delta_1$ and $\delta_2$. However, their sum 
displays a periodicity of one helical turn. Figures 9b and 
c show two such configurations for DNA molecules that 
are torsionally in-phase ($N = 210$ bp) or out-of-phase 
($N = 215$ bp). 
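
The assignment of the 210-bp and 215-bp loops to in-phase and out-of-phase cases, and the two periodicities mentioned above, follow from simple arithmetic on the helical twist. The sketch below assumes that an N-bp chain has N − 1 dinucleotide steps (consistent with the contour length quoted in the caption of figure 9) and uses the canonical twist of 34.45° per step. Function and constant names are hypothetical.

```python
TAU = 34.45  # intrinsic twist per dinucleotide step in degrees (value from the text)


def helical_turns(n_bp, tau=TAU):
    """Number of helical turns in an n_bp chain, counting n_bp - 1 steps."""
    return (n_bp - 1) * tau / 360.0


def phase_offset(n_bp, tau=TAU):
    """Signed distance (in turns) from the nearest integral number of turns."""
    turns = helical_turns(n_bp, tau)
    return turns - round(turns)


# One helical turn in base pairs: the period of the summed J-factor.
HELICAL_REPEAT_BP = 360.0 / TAU

# The phase angles depend on half the accumulated twist, so a single
# configuration only repeats after two helical turns.
SINGLE_CONFIG_PERIOD_BP = 720.0 / TAU

offset_210 = phase_offset(210)  # close to 0: torsionally in-phase
offset_215 = phase_offset(215)  # close to 0.5 in magnitude: out-of-phase
```

Under these assumptions the 210-bp chain contains almost exactly 20 helical turns, whereas the 215-bp chain is close to half-integral, in line with the peak and valley described for figure 9a.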

In the case of cyclisation, the helical-phase dependence 
of the J-factor persists at DNA lengths well beyond that 
corresponding to the maximum value of J, which lies near 
500 bp. This is clearly not the case for DNA looping. In 
figure 9a, the periodic dependence of J on DNA length 
for 10-bp end-to-end separation decays nearly to zero well 
before the maximum J value is reached. Although the 
periodicity of J is not attenuated quite as strongly for 30- 
bp end separation, there is less than four-fold variation 
in the value of J near 300 bp, as opposed to the more 
than ten-fold variation in cyclisation J-factors expected 
in this length range. The differences between looping 
and cyclisation are largely due to substantial differences 
in the relative contributions of DNA writhe in the two 
processes, as discussed below. 



F. DNA looping in synapsis 

Intramolecular reactions of most site-specific recombination 
systems ((46,; ,48i) and a number of DNA restriction 
endonucleases such as SfiI and NgoMIV (49) proceed 
through protein-mediated intermediate structures in 
which a pair of DNA sites are brought together in space 
and the intervening DNA is looped out. The intermediate 
nucleoprotein complex involved in site pairing and strand 
cleavage (and also exchange, in the case of recombinases) 
is termed the synaptic complex. In these systems, two 
characteristic geometric parameters are of interest: the 
average through-space distance between the sites and the 
average crossing angle between the two ends of the loop, 
which we denote the axial angle (see section III.C). The 
latter quantity can be described in terms of the twist angle 
between the protein domains, $\tau_{PP}$ (figure 7a), and we 
use these terms interchangeably. 

Figure 10 shows the helical dependence of looping (figure 
10a) and the elastic-minimum configuration of DNA 
loops (figure 10b) for different values of the axial angle. 
The most prominent feature of these results is that the 
phase of the helical dependence is shifted as a function 
of the axial angle, characterised by a relative global shift 
of the curve along the x-axis. This implies that DNA 
looping does not always occur most efficiently when two 
sites are separated by an integral number of helical turns, 
as has been suggested for some simple DNA looping sys- 
tems studied previously. The axial angle also globally 
modulates J-factors, which is apparent from the verti- 
cal shift in the J versus length curve and effects on the 
amplitude of the helical dependence. The torsion-angle- 
independent value of J, averaged over a full helical turn, 
decreases with increasing axial angle, whereas the am- 
plitude of the helical dependence increases. The above 
observations can be qualitatively explained by analogous 
results from DNA cyclisation. As in cyclisation, DNA 
forms loops most efficiently when the number of helical 
turns in the loop is close to an integer value. It is there- 
fore appropriate to consider this issue in terms of the 
linking number for the looped conformation, Lk, which 
involves contributions from the geometries of both the 
protein and DNA. 

We define the loop helical turn $Ht_{loop}$ as the sum of 
the DNA twist and the twist introduced by the protein 
subunits, divided by 360°. Therefore, changing the protein-protein 
twist angle, i.e. the axial angle, will shift the phase of the helical 
dependence relative to that of the DNA alone. For a loop 
with $N = 179$ bp and $\tau_{PP} = 0$, the total twist is simply 
equal to that for the DNA loop. Because this loop has 
17.0 helical turns, only one loop topoisomer contributes 
to the J-factor. The value of J is a local maximum at 
$\tau_{PP} = 0$ and, as shown in figure 11a, decreases monotonically 
for both $\tau_{PP} > 0$ and $\tau_{PP} < 0$. Contributions 
to J from other topoisomers of the 179-bp loop are less 
than 5 percent over the range $-135^\circ < \tau_{PP} < +120^\circ$. 
The twist for the planar equilibrium conformation of a 
173-bp loop is 16.5 helical turns; thus there are two alternative 
loops that can be efficiently formed (figure 11a): 
either a loop with $Ht_{loop} = 17.0$ and $\tau_{PP} > 0$, or a loop 
with $Ht_{loop} = 16.0$ and $\tau_{PP} < 0$. The J value at $\tau_{PP} = 0$ 
is a local minimum and there is a bimodal dependence 
on axial angle for loops in which the DNA twist is half-integral. 
We investigated the phase shift of the J-factor 
and found that this quantity is a non-linear function of 
the axial angle. From figure 10a, the calculated phase 
shifts for 60° and 120° axial angles relative to 0° are approximately 
52° and 103°, respectively. Moreover, the 
local maxima for the total J curve for $N = 173$ shown in 
figure 11a are located at −58.5° and 63°, positions that 
are not in agreement with predicted angle values based 
solely on $Ht_{loop}$ (−166° and 194°, respectively). 
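
The twist-only predictions quoted above can be reproduced approximately with the definition of the loop helical turn. The following sketch assumes that the DNA contributes N − 1 steps of 34.45° each and that the only protein contribution is the protein-protein twist angle. Both are simplifying assumptions of this illustration; under them the predicted angles agree with the quoted −166° and 194° to within about one degree. All function names are hypothetical.

```python
TAU = 34.45  # intrinsic twist per dinucleotide step in degrees (value from the text)


def dna_twist_deg(n_bp, tau=TAU):
    """Total DNA twist in degrees, counting n_bp - 1 dinucleotide steps."""
    return (n_bp - 1) * tau


def loop_helical_turns(n_bp, tau_pp_deg, tau=TAU):
    """Loop helical turn: (DNA twist + protein-protein twist) / 360."""
    return (dna_twist_deg(n_bp, tau) + tau_pp_deg) / 360.0


def tau_pp_for_turns(n_bp, target_turns, tau=TAU):
    """Protein-protein twist (degrees) that makes the loop helical turn integral."""
    return 360.0 * target_turns - dna_twist_deg(n_bp, tau)


turns_179 = loop_helical_turns(179, 0.0)  # close to 17.0
turns_173 = loop_helical_turns(173, 0.0)  # close to 16.5
tau_pp_lk16 = tau_pp_for_turns(173, 16)   # twist-only prediction for the Lk = 16 loop
tau_pp_lk17 = tau_pp_for_turns(173, 17)   # twist-only prediction for the Lk = 17 loop
```

The large gap between these twist-only predictions and the observed maxima near −58.5° and 63° is what points to writhe as the missing contribution.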

These deviations can be explained by the fact that 
writhe makes an important contribution to the overall 
Lk for the loop. This aspect of DNA looping is dra- 
matically different from that in the cyclisation of small 
DNA molecules. The conformations of small DNA cir- 
cles are close to planar and the writhe contribution is 
small relative to DNA twist (jH; [13; [13; [Ml) . In the case 
of protein-mediated looping, nonzero values of the axial 
angle impose an intrinsically nonplanar conformation on 

FIG. 10 Dependence of the J-factor on axial angle (the J-factor 
is defined in section III.A and the axial angle is defined 
as the average crossing angle between the two ends of the loop, 
see section III.C). (a) DNA-length dependence of J for axial 
angles of 0°, 60°, and 120° with the end-to-end separation set 
equal to 40 bp. Note that the positions of the extrema shift to 
the left with increasing values of the axial angle. (b) Stereo 
models of minimum elastic-energy conformations of 179-bp 
loops colour coded in accord with the corresponding axial-angle 
values in (a). 



the DNA. The relative contributions of loop writhe and 
twist for the Lk = 16 topoisomer of a 173-bp loop are 
shown as a function of axial angle in figure 11b. 

In figure 11c, we plot the axial-angle-dependent values 
of the bending and twisting free energies for the Lk = 16 
topoisomer and their sum, which is the total elastic free 
energy of the loop. The minimum value of the total elastic 
energy occurs at $\tau_{PP} = -58.5^\circ$, coincident with the 
position of the J-factor maximum for this topoisomer 
(figure 11a). This mechanical state can be achieved with 
very little twist deformation of the loop, but at the expense 
of significant bending energy. Further reduction of 

FIG. 11 J-factor, loop-geometry parameters, and elastic free 
energies as functions of axial angle; compare figure 10. (a) 
J-factor values for loop topoisomers corresponding to 179-bp 
and 173-bp loops in figure 10. The principal contribution to 
J for $N = 179$ bp comes from a single loop topoisomer with 
Lk = 17. For $N = 173$ bp, the overall J-factor is the sum 
of contributions from two loop topoisomers with Lk values 
of 16 and 17, generating a bimodal dependence of J on axial 
angle as described in the text. (b) Excess helical twist, $\Delta Ht$, 
and writhe of the loop formed by the Lk = 16 topoisomer 
for $N = 173$ bp as a function of axial angle. Excess twist 
is computed from the expression $Ht_{loop} - 16$, where $Ht_{loop}$ 
is the loop helical turn value described in the text, and depends 
linearly on the axial angle. The writhing number of the 
loop was calculated using the method of Vologodskii |65l : Issh. 
(c) Elastic free energies of the Lk = 16 loop topoisomer for 
$N = 173$ bp calculated according to equation 38 of Zhang and 
Crothers (53). The individual contributions of bending and 
twisting energies are shown along with their sum. 



the axial angle requires even less twisting energy; however, 
the bending energy increases monotonically. In contrast, 
for $\tau_{PP} > -58.5^\circ$, somewhat less bending energy is 
required, but the twisting energy begins to increase significantly 
with increasing axial angle. Since the sense of 
the bending deformation for $\tau_{PP} > 0$ opposes the needed 
reduction in loop linking number, the elastic energy cannot 
be decreased by increasing the axial angle. The only 
way that the loop geometry can compensate for this is 
through twist deformation. This asymmetry arises because 
we are considering the contribution of only one 
loop topoisomer to the elastic free energy. 

G. Conclusion 

The statistical-mechanical theory for DNA looping dis- 
cussed above (|52| ; [ssh suggests that the helical depen- 
dence of DNA looping is affected by many factors and 
leads to the conclusion that whereas a positive helical- 
twist assay can often confirm DNA looping, a negative 
result cannot exclude DNA looping. Since it is difficult 
to explore the architecture of DNA loops with current 
experimental techniques, this theory will be useful for 
more reliably analysing DNA looping with limited exper- 
imental data. The model has advantages over previous 
approaches based exclusively on DNA mechanics, partic- 
ularly when protein flexibility is taken into account. In 
these cases, entropy effects become important and are 
responsible for the observed decay of looping efficiency 
with DNA length. 

IV. DNA KNOTS AND THEIR CONSEQUENCES: 
ENTROPY AND TARGETED KNOT REMOVAL 

Bacterial DNA occurs largely in circular form. No- 
tably, instead of a simply connected ring shape (the un- 
knot), the DNA often exhibits permanently entangled 
states, such as catenated and knotted DNA. An example 
of a DNA trefoil knot is shown in figure 12. Such configurations 
have potentially devastating effects on cell 
development. Conversely, however, knots might have designed 
purposes in gene regulation, separating different 
regions of the genome, or, alternatively, locking chemi- 
cally remote parts of the genome proximate in geomet- 
rical space. In eukaryotic cells additional topological ef- 
fects occur in the likely entanglement of individual chro- 
mosomes. Here, we concentrate on the prokaryotic case. 

A. Physiological background of knots 

The discovery of how molecular-biological tools can be 
used to create knotted DNA resolved a long-standing argument 
against the Watson-Crick double helix picture of 
DNA ([l^ , namely that the replication of DNA could not 
work as the opening up of the double helix would produce 
a superstructure such that the two daughter strands 
could not be separated. In fact, the topology of both ssDNA 
and dsDNA is continuously changed in vivo, and 
this can readily be mimicked in vitro, although the activity 
of enzymes in vivo is much more restricted than in 
vitro ((ssl ; \8w : That is, the different concentrations of enzymes 
relative to knotted DNA molecules that are accessible in vitro 
make it possible to probe topology-altering effects of 
enzymes which do not contribute to such effects in vivo. 




FIG. 12 Electron microscope image of a DNA trefoil knot, 
from IsJ). © Science, with permission. 



Although it would be likely with a probability of 
roughly i that the linear DNA injected by bacteriophage 
λ into its host E. coli would create a knot before 
cyclisation, it turned out to be difficult to detect (fl^. 
First studies therefore concentrated on the fact that under 
physiological conditions knots are introduced by enzymes, 
DNA replication and recombination, DNA repair, 
and topoisomerisation, using these enzymes to prove 
both knotting and unknotting (QillSiSSSISll;®- 
DNA-knotting is also prone to occur behind a stalled 
replication fork (93; 94). Some of the typical topology-altering 
reactions occurring in E. coli are summarised 
in figure 13. Knots can efficiently be created from 
nicked^^ dsDNA under action of topoisomerase I at non-physiological 
concentrations (95). Another possibility is 
active packaging of a DNA mutant into phage capsids 
(96), followed by denaturing the capsid proteins. Both 
methods produce a distribution of different knot types, 
which can be separated by electrophoresis (97). 

The existence of DNA-knots has far-reaching effects 
on physiological processes, and knottedness of DNA has 
therefore to be eliminated in order to maintain proper 
functioning of the cell. Among other possible effects, it 
is immediately clear that the presence of a knot in a cir- 
cular DNA impedes replication of the DNA, i.e., the full 
separation of the two daughter strands ([l|; [H). More- 
over, even transcription is impaired (josl ). The presence 
of knots inhibits the assembly of chromatin (fool ) , knotted 
chromosomes cannot be separated during mitosis ([ll), and 
knots in a chromosome may serve as topological barri- 
ers between different sections of chromosomes, such that 
the genomic structural organisation is altered, and certain 
sections of the chromosomal DNA may no longer 
interact (100). Conversely, it is conceivable that knots, 
analogously to protein-induced DNA looping, lock remote 
segments of the genome close together in geometric 

FIG. 13 Enzymes changing the topology of dsDNA by cutting 
and pasting of one or both strands (example for E. coli): (A) 
Torsional stress resulting from the Lk deficit causes the DNA 
double helix to writhe about itself (negative supercoiling). In 
E. coli, gyrase introduces negative supercoils into DNA and 
is countered by topoisomerase I (topo I) and topo IV, which 
relax negative supercoils. (B) Topo IV unlinks catenanes generated 
by replication or recombination in vivo. (C) Topo IV 
unknots DNA in vivo. After (jsBh. 



space. Finally, knots may lead to double-strand breaks, 
as they weaken biopolymers considerably due to creation 
of localised sharp bends (flOll: [lo3 [Toa [lol as well as 
macroscopic lines and ropes^O^^^ 

Above we said that knots can be introduced, inter alia, 
by the different enzymes of the topoisomerase family. To 
remove a knot from a dsDNA, it is necessary to cut both 
strands, and then pass one segment through the created 
gap, before resealing the two open ends. In vivo, this 
is usually achieved by topoisomerases II and IV. A reconstruction 
of topo II is shown in figure 14, indicating 
the upper clamp holding a segment of the DNA, while 
the bulge-clamp introduces the cut through which the 
upper segment is passed. In the figure, the segment vis- 
ible in the pocket of the lower clamp has already been 
passed through the gap. After resealing, topo II de- 
taches. This process requires energy, provided by ATP. 
Notably, topo II is extremely efficient, for circular ds- 



One of the two strands is cut. 



The weakness of strings at the site of the knot can be experienced 
easily by pulling apart a linear nylon string in comparison to a 
knotted one (102). 

FIG. 14 Topoisomerase II. This enzyme can actively change 
the topology of DNA by cutting the double strand and passing 
another segment of double-stranded DNA through the gap 
before resealing it. The image depicts a short stretch of DNA 
(horizontally at the bulge of the enzyme), as well as another 
segment in the lower clamp (perpendicular to the image) after 
passage through the gap from the upper clamp. This mechanism 
makes sure that no additional strand passage through 
the open gap can take place (110; 111). Figure courtesy James 
M Berger, UC Berkeley. 



DNA of length ~ 10 kbp it was found that topo II reduced 
the knotted state by a factor of between 50 and 100 in comparison 
to a 'dumb' enzyme, which would simply pass 
segments through at random (106). We note that the 
step-wise action of topoisomerase II was recorded in a 
single molecule setup using magnetic tweezers (107; 108). 



Topoisomerases are surveyed in the review of (109). 



B. Classification of knots 

Knottedness can only be defined on a closed (circular) 
chain. This is intuitively clear as in an open linear chain 
a knot can always be tied, or an existing knot released. 
Mathematically, this means that knot invariants are only 
well-defined for a closed space-curve. However, a linear 
chain whose ends are permanently attached to one, or two 
walls, or whose ends are extended towards infinity, can 
be considered as (un)knotted in the proper mathematical 
sense, i.e., their knottedness cannot change. In a looser 
sense, we will also speak of knots on an open piece of 
DNA, appealing to intuition. 

The classification of knots, or graphs in general, in 
terms of invariants can essentially be traced back to Euler, 
recalling his graph-theoretical elaboration in connection 
with the Bridges of Königsberg problem (112), 
determining a closed path by crossing each Königsberg 
bridge exactly once. However, the first investigations of 
topological problems in modern science are most probably 
due to Kepler, who studied surface tiling in great 
detail (hence the notion of Kepler tiling in the mathematical 
literature) (113). Further initial steps were due 
to Leibniz, Vandermonde and Gauss, in whose collection 
of papers drawings of various knots were found^^ and whose 
linking ('Umschlingungen' = windings) number is indeed 
a knot invariant (114; 115; 116). Gauss' student, Listing, 
in fact introduced the term 'topology', and his work 
on knots may be viewed as the real starting point of 
knot theory (117), although his complexions number was 
proved by Tait not to be an invariant. 

Inspired by Helmholtz' theory of an ideal fluid and 
building on Listing's early contributions to knot theory, 
Scotsmen and chums Maxwell, Tait and Thomson 
(Lord Kelvin) started to discuss the possible implications 
of knottedness in physics and chemistry, ultimately 
distilled into Thomson's theory of vortex atoms 
(118; 119). Out of this endeavour emerged Tait's interest 
in knots, and he devoted most of his career to the 
classification of knots. Numerous charts and still unresolved 
conjectures on knots document his pioneering 
work (120; 121; 122; 123). The studies were carried on 
by Kirkman and Little (124; 125; 126; 127). A more detailed 
historical account of knot theory may be found in 
the review article by van de Griend (128), and on the St. 
Andrews history of mathematics webpages^^. 

Planar projections of knots were rendered unique by 
Listing's introduction of the handedness of a crossing, 
i.e., the orientational information assigned to a point 
where in the projection two lines intersect. With this 
information, projections are the standard representation 
for knot studies. On their basis, the minimum number of 
crossings ('essential crossings') can be immediately read 
off as one of the simplest knot invariants. To arrive at 
the minimum number, one makes use of the Reidemeister 
moves, three fundamental permitted moves of the lines 
in a knot projection, as shown in figure 15. More complex 
knot invariants include polynomials of the Alexander, 
Kauffman and HOMFLY types ^1^, [ill; [IH).^^ 
Here, we will only employ the number of essential cross- 
ings as classification of knots, in particular, we do not 
concern ourselves with the question of degeneracy for a 
given knot invariant. However, the bookkeeping of knot 



Probably copies from an English original. 

The MacTutor History of Mathematics archive, 
URL: http://turnbull.mcs.st-and.ac.uk/~history/ 

These polynomials all start to be degenerate for higher order 
knots, i.e., above a certain knot complexity several knots may 
correspond to one given polynomial (114; 115). In the case of the 
simpler knots attained in most DNA configurations and in knot 
simulations, the Alexander polynomials are unique, in contrast to 
the Gauss or Edwards invariant, compare, e.g., reference (129). 




FIG. 15 The three Reidemeister moves. All topology-preserving 
moves of a knot projection can be decomposed 
into these three fundamental moves. 

types is vital in knot simulations. 

C. Long chains are almost always entangled. 

During the polymerisation and final cyclisation of a 
polymer grown in a solvent under freely floating conditions, 
a knot is created with probability 1. This Frisch-Wassermann-Delbrück 
conjecture (130; 131) could be 
mathematically proved for a self-avoiding chain (132; 
133), compare also (134). This is consistent with numerical 
findings that the probability of unknot formation 
decreases dramatically with chain length (129; 135). 
Indeed, recent simulation results indicate that the probability 
of finding the unknot in such a cyclised polymer 
decays exponentially with chain length (136; 137; 138): 

$$P_{\emptyset}(N) \propto \exp\left(-\frac{N}{N_c}\right). \qquad (6)$$
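
As a quick illustration of the exponential law of equation (6), the sketch below evaluates the unknot probability and the chain length at which it has dropped to one half. The characteristic length is not specified in the text, so the value 500 used here is purely hypothetical, as are the function names and the unit prefactor.

```python
import math


def unknot_probability(n, n_c, prefactor=1.0):
    """Unknot probability for chain length n, decaying as exp(-n / n_c).

    The prefactor is set to 1 for illustration; equation (6) only fixes
    the proportionality.
    """
    return prefactor * math.exp(-n / n_c)


def half_length(n_c):
    """Chain length at which the unknot probability falls to half its n = 0 value."""
    return n_c * math.log(2.0)


N_C_EXAMPLE = 500.0                    # hypothetical characteristic length
n_half = half_length(N_C_EXAMPLE)      # about 0.69 * N_C_EXAMPLE
p_half = unknot_probability(n_half, N_C_EXAMPLE)
```

Because the decay is a pure exponential, doubling the chain length squares the unknot probability, so long chains are knotted with near certainty once N exceeds a few characteristic lengths.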

However, there exist theoretical arguments and simulation 
results indicating that the characteristic number 
of monomers $N_c$ occurring in this relation may become 
surprisingly large (fliol : [itll : [lil [Tisl) . The probability 
to find a given knot type $\mathcal{K}$ on random circular polymer 
formation has been fitted with the functional form 

$$P_{\mathcal{K}}(N) = a (N - N_0)^b \exp\left(-\frac{N^c}{d}\right), \qquad (7)$$

where a, b, and d are free parameters depending on $\mathcal{K}$, 
and $c \approx 0.18$. $N_0$ is the minimal number of segments 

FIG. 16 Figure-eight structure, in which a slip-link separates 
two loops of size n and N − n, such that they can freely 
exchange length among each other, but none of the loops 
can completely retract from the slip-link. On the right, a 
schematic drawing of a slip-link, which may be thought of as 
a small belt buckle. 

required to form a knot $\mathcal{K}$, without the closing segment 
(145). The tendency towards knotting during polymer 
cyclisation creates problems in industrial and laboratory 
processes. 
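
The fitted form for the probability of a given knot type rises as a power law above the minimal length and is eventually cut off by a stretched exponential, so it passes through a maximum at some intermediate chain length. The sketch below evaluates that form and locates the maximum by brute force. It assumes the reading "a times (N − N0) to the power b, times exp(−N^c / d)" of equation (7). Only c = 0.18 is taken from the text; the parameter values used in the example are invented for illustration and do not correspond to any specific knot type.

```python
import math

C_EXPONENT = 0.18  # exponent c quoted in the text


def knot_probability(n, a, b, d, n0, c=C_EXPONENT):
    """Fitted probability of a given knot type for a chain of n segments."""
    if n <= n0:
        return 0.0  # fewer segments than the minimum needed to tie the knot
    return a * (n - n0) ** b * math.exp(-(n ** c) / d)


def most_likely_length(a, b, d, n0, n_max):
    """Integer chain length in (n0, n_max] that maximises the fitted probability."""
    return max(range(n0 + 1, n_max + 1),
               key=lambda n: knot_probability(n, a, b, d, n0))


# Hypothetical parameters, chosen only to show the shape of the curve.
A, B, D, N0 = 1.0, 1.0, 0.4, 6
peak = most_likely_length(A, B, D, N0, n_max=2000)
```

A brute-force scan over integer lengths is used instead of solving the stationarity condition analytically, since the chain length is a discrete quantity and the scan makes no assumption about the parameter values.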

D. Entropic localisation in the figure-eight slip-link 
structure. 

To obtain a feeling for how and when entropy leads 
to the localisation of a permanently entangled structure, 
we consider the simplest polymer object with non-trivial 
(non-unknot) geometry, the figure-eight structure (F8) 
displayed in figure 16. In this compound, a pair contact 
is enforced by a slip-link, separating off two loops 
in the circular polymer, such that none of the loops can 
fully retract, and both loops can freely exchange length 
among each other. We denote the loop sizes by n and 
N — n, where N is the (conserved) total length of the 
polymer chain. For such an object, we can actually per- 
form a closed statistical mechanical analysis based on 
results from scaling theory of polymers, and compare the 
result with Monte Carlo simulations of the F8. 

The statistical quantities that are of particular interest 
are the gyration radius, $R_g$, and the number of degrees of 
freedom, $\omega$ (146). $R_g$, as defined in equation (pT|) , measures 
the root mean squared distance of the monomers 
along the chain to the gyration centre, and is therefore 
a good measure of its extension. It can, for instance, be 
measured by light scattering experiments. The degrees of 
freedom $\omega$ count all possible different configurations of 
the chain. For a circular polymer (i.e., a polymer with 
$\mathbf{r}(0) = \mathbf{r}(N)$), the gyration radius becomes 

$$R_g \simeq A N^{\nu} \qquad (8)$$

with exponent $\nu = 1/2$ for a Gaussian chain, and $\nu = 0.588$ 
in the 3D excluded volume case ($\nu = 3/5$ in the 
Flory model, and $\nu = 3/4$ in 2D). Whereas in 2D this 
scaling describes a true ring polymer, in 3D the exponent 
$\nu$ emerges from averaging over all possible topologies, and 
necessarily includes knots of all types (146, 147, 148). 

For a circular chain, the number of degrees of freedom 
contains the number of all possible ways to place an N-step 
walk on the lattice with connectivity $\mu$ (e.g., $\mu = 2d$ 
on a cubic lattice in d dimensions), $\mu^{N}$, and the entropy 
loss for requiring a closed loop, $N^{-d\nu}$, involving the same 
Flory exponent $\nu$. For the Gaussian case, we recognise in 
this entropy loss factor the returning probability of the 
random walk. In the excluded volume case, $N^{-d\nu}$ is an 



analogous measure ([2^; 11461 : llSOf ). Thus, for a circular 
chain embedded in d-dimensional space, the number of 
degrees of freedom is 



$$\omega \simeq \mu^{N} N^{-d\nu} \qquad (9)$$



Let us evaluate these measures for the F8 from figure 16. 
As a first approximation, consider the F8 as a Gaussian 
(phantom) random walk. This case demonstrates that, as 
in the charged knot case (149), entropic effects give rise 
to long-range interactions. The two loops correspond to 
returning random walks, i.e., the number of degrees of 
freedom for the F8 in the phantom chain case becomes 
([2i, 146, 151) 

$$\omega_{\mathrm{F8,PC}} \simeq \mu^{N} n^{-d/2} (N-n)^{-d/2} \qquad (10)$$



where d is the embedding dimension. We note that normalisation 
of this expression produces the probability 
density function for finding the F8 with a given loop size 
$\ell = na$ ($L = Na$), 

$$p(\ell) = \mathscr{N} \ell^{-d/2} (L-\ell)^{-d/2} \qquad (11)$$



where $\mathscr{N}$ denotes a normalisation factor. The conversion 
from expressing the chain size in terms of the number of 
monomers to its actual length is of advantage in what follows, 
as it makes it easier to keep track of dimensions. 
Here, we use the length unit a, which may be interpreted 
as the monomer size (lattice constant), or as the size of 
a Kuhn statistical segment. 

To classify different grades of localisation, we follow 
the convention from references (152, 153). The average 
loop size $\langle\ell\rangle$, determined through 
$\langle\ell\rangle = \int_a^{L-a} \ell\, p(\ell)\, d\ell$, is 
trivially $\langle\ell\rangle = L/2$ by symmetry of the structure. Here, 
we introduce a short-distance cutoff set by the lattice 
constant a. However, the probability density function 
is strongly peaked at $\ell = 0$ and $\ell = L$, the two 
poles caused by the returning probabilities, and therefore 
a typical shape consists of one small (tight) and one 



^16 Here and in the following we consider two configurations of a 
polymer chain different if they cannot be matched by translation. 
In addition, the origin of a given structure is fixed by a 
vertex point (see below), i.e., a point where several legs of the 
polymer chain are joined. In the F8-structure, this vertex naturally 
coincides with the slip-link. For a simply connected ring 
polymer, such a vertex is a two-vertex anywhere along the chain. 




FIG. 17 Bead-and-tether chain used in Monte Carlo simula- 
tion, showing a typical equilibrium configuration for a self- 
avoiding chain: the localisation of the smaller loop is distinct. 
Note that in this 2D simulation the slip-link is represented by 
the three tethered black beads. 



large (loose) loop, compare figure 17. This can be quantified 
in terms of the average size of the smaller loop, 

$$\langle\ell\rangle_{<} = 2 \int_a^{L/2} \ell\, p(\ell)\, d\ell. \qquad (12)$$

In d = 2, we obtain 

$$\langle\ell\rangle_{<} \sim \frac{L}{|\log(a/L)|}, \qquad (13)$$



such that with the logarithmic correction the smaller loop 
is only marginally smaller than the big one. In contrast, 
one observes weak localisation 

$$\langle\ell\rangle_{<} \sim a^{1/2} L^{1/2} \qquad (14)$$

in d = 3, in the sense that the relative size $\langle\ell\rangle_{<}/L$ tends 
to zero for large chains. By comparison, for d > 4 one 
encounters $\langle\ell\rangle_{<} \sim a$, corresponding to strong localisation, 
as the size of the smaller loop does not depend on L 
but is set by the short-distance cutoff a. Above four 
dimensions, excluded volume effects become negligible, 
and therefore both Gaussian and self-avoiding chains are 
strongly localised in d > 4.^17 
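
The three regimes quoted above for the phantom chain can be checked numerically. The following is a minimal Python sketch, assuming the density $p(\ell) \propto \ell^{-d/2}(L-\ell)^{-d/2}$ with the cutoff a at both ends; the function name `mean_smaller_loop`, the logarithmic integration grid and the number of grid points are illustrative choices and not part of the original analysis.

```python
import math

def mean_smaller_loop(L, d, a=1.0, points=20001):
    """Average size of the smaller loop of a phantom figure-eight.

    By symmetry the normalisation over [a, L - a] equals twice the
    integral over [a, L/2], so <l>_< is a ratio of two integrals over
    [a, L/2]. Both are evaluated with the trapezoidal rule on a grid
    that is uniform in u = ln(l), using dl = l du.
    """
    lo, hi = math.log(a), math.log(L / 2.0)
    h = (hi - lo) / (points - 1)
    num = 0.0
    den = 0.0
    for i in range(points):
        ell = math.exp(lo + i * h)
        # Unnormalised phantom-chain weight of a loop of size ell.
        w = ell ** (-d / 2.0) * (L - ell) ** (-d / 2.0)
        end_weight = 0.5 if i in (0, points - 1) else 1.0
        den += end_weight * w * ell * h
        num += end_weight * w * ell * ell * h
    return num / den

if __name__ == "__main__":
    for d in (2, 3, 5):
        for L in (1e4, 1e6):
            print(d, L, mean_smaller_loop(L, d))
```

With a = 1, the printed values should follow $L \ln 2/\ln(L/a)$ in d = 2, approach $\sqrt{aL}$ in d = 3, and stay close to 3a in d = 5 regardless of L, up to corrections of relative order $\sqrt{a/L}$.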

To include self-avoiding interactions, we make use of 
results for general polymer networks obtained by Duplantier 
(147, 154), which are summarised in the appendix 
at the end of this review. In terms of such networks, 
our F8-structure corresponds to the following parameters: 
the number $\mathcal{N} = 2$ of polymer segments with 
lengths $s_1 = \ell = na$ and $s_2 = L - \ell = (N - n)a$, forming 
$\mathcal{L} = 2$ physical loops, connected by $n_4 = 1$ vertex 



^17 Consideration of more than the three physical space dimensions is 
often useful in polymer physics. 

of order four. By virtue of equation (79), the number of 
configurations of the F8 with fixed $\ell$ follows the scaling 
form 

$$\omega_{\mathrm{F8}}(\ell) \simeq \mu^{N} (L-\ell)^{\gamma_{\mathrm{F8}}-1} X_{\mathrm{F8}}\!\left(\frac{\ell}{L-\ell}\right), \qquad (15)$$

with the configuration exponent $\gamma_{\mathrm{F8}} = 1 - 2d\nu + \sigma_4$. 
In the limit $\ell \ll L$, the contribution of the large loop 
in equation (15) should not be affected by a small appendix, 
and therefore should exhibit the regular Flory 
scaling $\sim (L-\ell)^{-d\nu}$ (155, 156, 210). This fixes the 
behaviour of the scaling function, $X_{\mathrm{F8}}(x) \sim x^{\gamma_{\mathrm{F8}}-1+d\nu}$, 
in this limit ($x \to 0$ in the dimensionless variable x), such 
that 

$$\omega_{\mathrm{F8}}(\ell) \simeq \mu^{N} (L-\ell)^{-d\nu} \ell^{-c}, \qquad \ell \ll L, \qquad (16)$$

where $c = -(\gamma_{\mathrm{F8}} - 1 + d\nu) = d\nu - \sigma_4$. Using $\sigma_4 = -19/16$ 
and $\nu = 3/4$ in d = 2 (147, 154), we obtain 

$$c = 43/16 = 2.6875, \qquad d = 2. \qquad (17)$$

In d = 3, $\sigma_4 \approx -0.48$ dili; [Tli; and $\nu \approx 0.588$, so 
that 

$$c \approx 2.24. \qquad (18)$$

In both cases the result c > 2 enforces that the loop of 
length $\ell$ is strongly localised in the sense defined above. 
This result is self-consistent with the a priori assumption 
$\ell \ll L$. Note that for self-avoiding chains, in d = 2 the 
localisation is even stronger than in d = 3, in contrast to 
the corresponding trend for ideal chains. 
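
To make explicit why c > 2 implies strong localisation, one can insert a pure power law $p(\ell) \propto \ell^{-c}$ into the average size of the smaller loop. The following is a sketch under two simplifying assumptions that are not spelled out in the text: the power law is taken to hold over the whole range $a \le \ell \le L/2$, and $L \gg a$, so that the upper integration limit can be sent to infinity.

```latex
\langle\ell\rangle_{<} \simeq
\frac{\int_a^{\infty} \ell^{\,1-c}\, d\ell}{\int_a^{\infty} \ell^{-c}\, d\ell}
= \frac{a^{2-c}/(c-2)}{a^{1-c}/(c-1)}
= \frac{c-1}{c-2}\, a , \qquad c > 2 .
```

For c = 43/16 this estimate gives $\langle\ell\rangle_{<} \approx (27/11)\,a \approx 2.5\,a$, independent of L. For 1 < c < 2 the numerator is instead dominated by its upper limit, and one finds $\langle\ell\rangle_{<} \sim a^{c-1} L^{2-c}$, which for c = 3/2 gives $a^{1/2} L^{1/2}$, the weak localisation of the ideal chain in d = 3.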

We performed Monte Carlo (MC) simulations of the 
2D figure-eight structure, in which the slip-link was rep- 
resented by three tethered beads enforcing a sliding pair 
contact such that the loops cannot fully retract (see fig- 
ure [TS]). We used a 2D hard core bead-and-tether chain 
with 512 monomers, starting off from a symmetric initial 
condition with £ = L/2. Self-crossings were prevented by 
keeping a maximum bead-to-bead distance of 1.38 times 
the bead diameter, and a maximum step length of 0.15 
times the bead diameter. As shown in figure 19, the size 
distribution for the small loop can be fitted to a power 
law with exponent c = 2.68, in good agreement with 
equation (17). 
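
As an illustration of how such an exponent can be extracted from sampled loop sizes, the following Python sketch draws synthetic sizes from a pure power law and recovers c with a maximum-likelihood estimator. This is a stand-in technique, not the fit used for figure 19: the data are synthetic, the power law is assumed to extend from the cutoff a to infinity, and the names `sample_loop_sizes` and `estimate_exponent` are hypothetical.

```python
import math
import random

def sample_loop_sizes(c, n, a=1.0, seed=0):
    """Draw n sizes from p(l) proportional to l^(-c) on [a, infinity).

    Inverse-transform sampling: the survival function is (l/a)^(-(c-1)),
    so l = a * U^(-1/(c-1)) for U uniform in (0, 1].
    """
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    return [a * (1.0 - rng.random()) ** (-1.0 / (c - 1.0)) for _ in range(n)]

def estimate_exponent(sizes, a=1.0):
    """Maximum-likelihood estimate of c for a continuous power law with lower cutoff a."""
    log_sum = sum(math.log(s / a) for s in sizes)
    return 1.0 + len(sizes) / log_sum

sizes = sample_loop_sizes(c=43 / 16, n=20000)
c_hat = estimate_exponent(sizes)
print(f"estimated exponent: {c_hat:.3f}")
```

The statistical error of this estimator is about $(c-1)/\sqrt{n}$, so a few tens of thousands of independent samples resolve the second decimal of c. Successive Monte Carlo configurations are correlated, which would reduce the effective sample size in a real simulation.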

An experimental study of entropic tightening of 
a macroscopic F8-structure was reported in reference 
(158). There, a granular chain consisting of hollow steel 
spheres connected by steel rods was twisted once and 
then put on a vibrating table. From digital imaging, the 
distribution of loop sizes could be determined and compared 
to a power law with index 43/16 as calculated for 
the 2D excluded volume chain. The data were found 
to be consistent with this prediction (158). 

FIG. 18 Monte Carlo simulation of an F8-structure in 2D: 
loop sizes $\ell$ and $L - \ell$ as a function of Monte Carlo steps for a 
chain with 512 monomers. The symmetry breaking after the 
symmetric initial condition is distinct. 




FIG. 19 Power-law fit to the probability density function of 
the smaller loop. The fit produces a slope of -2.68, in excellent 
agreement with the calculated value. 

E. Simulations of entropic knots in 2D and 3D. 

Much of our knowledge about the interaction of knots 
with thermal fluctuations is based on simulations of knotted 
chains. Before going further into the theoretical modelling 
of knotted chains, we report some of the results 
based on simulation studies of both Gaussian and 
self-avoiding walks.^18 Such simulations either start with a 
given knot configuration and then perform moves of specific 
segments, each time making sure that the topology 
is preserved; or each new configuration emanates 
from a new random walk, whose topology is 
checked by calculating the corresponding knot invariant, 
usually the Alexander polynomial, and created configurations 
that do not match the desired topology are 
discarded. We note that it is of lesser significance that 
knot invariants such as the Alexander polynomial are in fact 
no longer unique for more complex knots, because for 
typical chain lengths simpler knots, for which the invariants 
are unique, are created with the highest probability. 
For more details we refer to the works quoted below. 

^18 Although per se a Gaussian chain cannot have a fixed topology 
due to its phantom character, such simulations introduce a fixed 
topology by rejecting moves that result in a different knot type. 

In fact, the fixed topology turns out to have a highly 
non-trivial effect on chains without self-excluded volume. 
As conjectured in (159), a Gaussian circular chain, whose 
permitted set of configurations is restricted to a fixed 
topology, will exhibit self-avoiding behaviour. This was 
confirmed in a numerical analysis in (148). The number 
of monomers required to reach this self-avoiding exponent 
was estimated to be of the order of 500. Keeping 
this non-trivial scaling of a Gaussian chain at fixed topology 
in mind, knot simulations on the basis of phantom 
Gaussian chains were performed in (160), always making 
sure that the configurations taken into the statistics fulfil 
the desired knot topology. 

The dependence of the gyration radius $R_g$ on the knot 
type was investigated for simpler knots in 3D in reference 
(139). On the basis of the expansion 

$$R_g^2 = A_{\mathcal{K}} \left(1 + B_{\mathcal{K}} N^{-\Delta} + C_{\mathcal{K}} N^{-1} + o(1/N)\right) N^{2\nu_{\mathcal{K}}}, \qquad (19)$$

including a confluent correction (139, 161, 162) in comparison 
to the standard expression (5), it was found that 
the Flory exponent $\nu_{\mathcal{K}}$ is independent of the knot type 
$\mathcal{K}$ and has the 3D value 0.588. This was interpreted via 
a localisation of knots such that the influence of tight 
knots on $R_g$ is vanishingly small. In fact, $\Delta$ is of the 
order of 0.5 according to the investigations in references 
(161, 162, 163, 164). Based on longer chains in comparison 
to reference (139), the study of (165) thus corroborates 
the independence of $\nu_{\mathcal{K}} \approx 0.588$ of the knot type $\mathcal{K}$. 
In recent AFM experiments analysing single DNA knots, 
the Flory scaling $R_g \simeq N^{\nu}$ was confirmed for both simple 
and complex knots (166). 

For the number of degrees of freedom $\omega_{\mathcal{K}}$, the form^19 

$$\omega_{\mathcal{K}} \simeq A_{\mathcal{K}} N^{\alpha_{\mathcal{K}}-2} \mu_{\mathcal{K}}^{N} \left(1 + \frac{B_{\mathcal{K}}}{N^{\Delta_{\mathcal{K}}}} + \ldots\right) \qquad (20)$$

with confluent corrections was used. It was found that, for the 
unknot with $\alpha_{\emptyset} \approx 0.27$, expression (20) is consistent with the 
standard result, $(0.27 - 2)/3 \approx -0.58 \approx -\nu$, while 
$\alpha_{\mathcal{K}} = \alpha_{\emptyset} + 1$ for prime knots, and for composite knots 
with $N_f$ prime components, 

$$\alpha_{\mathcal{K}} = \alpha_{\emptyset} + N_f. \qquad (21)$$

This finding is in agreement with the view that each 
prime component of a knot $\mathcal{K}$ is tightly localised and 
statistically able to move around one central loop, each 
prime component contributing an additional factor N to the 
number of degrees of freedom. The fact that for a chain of 
finite thickness the size of the big central loop is in fact 
diminished by the size of the tight knot is a confluent effect, such 
that the confluent exponent $\Delta$ should be related to the 
size distribution of the knot region. Not surprisingly, 
the connectivity factor $\mu_{\mathcal{K}} \approx 4.68$ was found to be 
independent of $\mathcal{K}$, assuming the standard value for a cubic 
lattice (167). Also the amplitude $A_{\mathcal{K}}$ and the exponent 
$\Delta_{\mathcal{K}}$ of the confluent correction turned out to be 
$\mathcal{K}$-independent. We note that a similar analysis in (pseudo) 
2D^20 also strongly points towards tight localisation of the 
knot (168). 

In contrast to the above results, 3D simulations undertaken 
in (169) (also compare (170)) show the dependence 

$$R_g \simeq A N^{3/5} C^{-4/15} \qquad (22)$$

of the gyration radius on the knot type, characterised 
by the number C of essential crossings. That is, $R_g$ 
decreases as a power law with C, where the exponent is 
$-4/15 = 1/3 - \nu$ (169). The functional form (22) was 
derived from a Flory-type argument for a polymer construct 
of C interlocked loops of equal length N/C by arguing 
that each loop occupies a volume $\sim (N/C)^{3\nu}$, and that 
the volume of the knot is given by $V \simeq C (N/C)^{3\nu}$ (i.e., 
assuming that due to self-avoiding repulsion the volumes 
of the individual loops add up to the total volume). Equation 
(22) then follows immediately. This model of equal 
loop sizes is equivalent to a completely delocalised knot. 
Although rather long chain sizes of up to 400 were used, 
it may therefore be speculated that the numerical algorithm 
employed for the simulations in (169) causes finite-size 
effects that, in turn, prevent knot localisation. We 
note that the Flory-type scaling assumed to derive expression 
(22) is consistent with a model brought forward 
in reference (171), in which the knot is quantified 
by the aspect ratio in a configuration corresponding to a 
maximally inflated tube with the given topology (i.e., a 
state corresponding to complete delocalisation). In reference 
(169), the temporal relaxation behaviour of a given 
knot was also studied. While regular Rouse behaviour 
was found for the case of the unknot, the knotted chains 
displayed somewhat surprising long-time contributions to 
the relaxation time spectrum (0; 172, 173, 174), a phenomenon 
already pointed out by de Gennes within an activation 
argument to create free volume in a tight knot in 
order to move along the chain (175). Note that relatively 
loose knots in shorter chains do not appear to exhibit such 
extremely long relaxation time behaviour (176). 
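
The step from the Flory-type volume estimate to the crossing-number dependence of the gyration radius can be written out in one line. The only added assumption in this sketch is that the gyration radius scales as the cube root of the occupied volume:

```latex
R_g \sim V^{1/3} \sim \left[ C \left( \frac{N}{C} \right)^{3\nu} \right]^{1/3}
    = N^{\nu}\, C^{\,1/3-\nu}
    \;\xrightarrow{\;\nu = 3/5\;}\; N^{3/5}\, C^{-4/15} .
```

The exponent of C is negative for any $\nu > 1/3$, so within this picture a more complex knot of the same length is always more compact.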

Simulation of a 3D knot with varying excluded volume 
showed that, if only the excluded volume becomes large 
enough, the gyration radius of the knot is independent 
of the knot type (177). The picture of tight knots is further 
corroborated in the study by Katritch et al. using 


^19 Note that we changed the exponent by 1 in comparison to the 
original work, making the counting of non-translatable configurations 
consistent with the counting convention specified in 
footnote 16. 



^20 The simulated polymer chain moves in 2D; however, crossings 
are permitted at which one chain passes underneath another. 
In that sense, the simulated polymers are in fact equivalent to knot 
projections with a certain orientation of the individual crossings. 

a Gaussian chain model with fixed topology to demonstrate 
that the size distribution of the knot is distinctly 
peaked at rather small sizes (144). 

Apart from determining the statistical quantities $R_g$ 
and $\omega_{\mathcal{K}}$ from simulations, there also exist indirect methods 
for quantifying the size of the knot region in a knotted 
polymer. One such method is to confine an open chain 
containing a knot between two walls, and to measure the 
finite-size corrections of the force-extension curve due to 
the knot size. This is based on the idea that the gyration 
radius for a system depending on more than one length 
scale (i.e., apart from the chain length N) shows the 
above-mentioned confluent corrections, such that (178) 

$$R_g \simeq A N^{\nu} \left(1 - B N^{-\Delta}\right) \qquad (23)$$

when only the largest correction is considered, and in 3D 
$\Delta \approx 0.5$ is supposed to be universal (flSlt [l6l [lellTei). 

If this leading correction is due to the argument $N_0/N$ 
in the scaling function $\Phi$, the length scale $N_0$ depends 
on N through the scaling $N_0 \sim N^{t}$ with $\Delta = 1 - t$. 
From Monte Carlo simulations of a bead-and-tether chain 
model, it could then be inferred that the size of the knot 
scales like (178) 

$$N_0 \sim N^{t}, \qquad t = 0.4 \pm 0.1. \qquad (24)$$


This, in turn, enters the force-extension curve $f' = G(R')$ 
with the dimensionless force $f' = f A N^{\nu}/(k_B T)$ and distance 
$R' = R/(A N^{\nu})$ of the walls, in the form with confluent 
correction 

$$f' \simeq G(R') \left(1 + g(R') N^{-\Delta}\right). \qquad (25)$$



From the simulation, t = 0.4 corresponds to the best 
data collapse, assuming the validity of the scaling arguments. 
An argument in favour of this approach is the 
consistency of the exponent t = 0.4 with the inferred 
$\Delta = 0.6$, which is close to the known value. Note that 
the force-extension of a chain with a slip-link was discussed 
in reference (179), where it was shown that a loop separated 
off by a slip-link is confined within a Pincus-de 
Gennes blob. We also note that results corresponding 
to delocalisation in force-size relations were reported in 
(180, 181). An entropic scale was conceived in (182): 
Separating two chains with fixed topology but allowing 
them to exchange length (e.g., through a small hole in a 
wall) would enable one to infer the localisation behaviour 
of a knot by comparing the equilibrium balance of this 
knot with a slip-link construct of known degrees of freedom 
until the average length on both sides coincides. 
The preliminary results in (182) are overshadowed by finite-size 
effects of the accessible system size, as limited by 
computation power. The analysis in reference (183) of a 
self-avoiding polygon model uses the method of closure 
of a short fragment of the knot and subsequent determination 
of its Alexander polynomial to obtain the scaling 
exponent t = 0.75; in a second approach, the authors find 
a consistent result by a variant of the knot scale method. 
Another recent study uses a more realistic model for a 
polymer chain, namely, a simplified model of polyethylene; 
with up to 1000 monomers in the simulation, the 
exponent $t \approx 0.65$ is found (and delocalisation is obtained 
in the dense phase) (184). 

Thus, there exist simulation results pointing in both 
directions, knot localisation and delocalisation. As the 
latter may be explained by finite-size effects, it seems 
likely that (at least simple) knots in 2D and 3D localise 
in the sense that the knot region occupies a portion of 
the chain that is significantly smaller than 
the entire chain. In particular, this would imply that the 
average size of the knot region $\langle\ell\rangle$ scales with the chain 
length Na with an exponent less than one, such that 

$$\lim_{N\to\infty} \frac{\langle\ell\rangle}{Na} = 0. \qquad (26)$$



Below, we show on analytical grounds that such a localisation 
is a natural consequence of interactions of a 
chain of fixed topology with fluctuations. We note, however, 
that conclusive results for knot localisation may 
in fact come from experiments: Manipulation of single 
chains such as DNA can be performed for rather long 
chains, making it possible to reach beyond the finite-size 
corrections inherent in, e.g., the force-extension simulations 
mentioned above. The aforementioned AFM studies 
on single DNA knots indeed reveal knot localisation 
of flattened knots (166); due to experimental limitations, 
presently only one DNA length was investigated, such 
that the scaling exponent t currently cannot be obtained. 
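
Obtaining t requires knot sizes for several chain lengths; the exponent is then the slope of $\log\langle\ell\rangle$ against $\log N$. The following Python sketch shows that step on synthetic, noise-free numbers. The data are invented for illustration only and are not simulation or experimental results; the function name `fit_power_law_exponent` and the prefactor 2 are hypothetical.

```python
import math

def fit_power_law_exponent(chain_lengths, knot_sizes):
    """Least-squares slope of log(knot size) versus log(chain length)."""
    xs = [math.log(n) for n in chain_lengths]
    ys = [math.log(size) for size in knot_sizes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Synthetic illustration (NOT measured data): <l> = 2 * N^0.75.
lengths = [100, 200, 400, 800, 1600]
sizes = [2.0 * n ** 0.75 for n in lengths]

t = fit_power_law_exponent(lengths, sizes)
# Relative knot size <l>/N; it shrinks with N whenever t < 1.
relative = [size / n for size, n in zip(sizes, lengths)]
print(t, relative)
```

Any fitted t below one means that the relative size of the knot region decreases with chain length, which is the localisation criterion of equation (26). A single chain length, as in the AFM study, fixes one point only and leaves the slope undetermined.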

Before proceeding to these analytical approaches, we 
note that simulations of knotted chains have also been 
performed under non-dilute conditions (185, 186). 
In (pseudo) 2D, these have found delocalisation of the 
knot, i.e., $\lim_{N\to\infty} \langle\ell\rangle/N = \mathrm{const}$. We come back to 
these simulations below in connection with the modelling 
of dense and $\Theta$-knots. 



F. Flattened knots in dilute and dense phases. 

Analytically, knots are a hard problem to tackle. Statistical 
mechanical treatments of permanently entangled 
polymers are so difficult because topological restrictions 
cannot be formulated as a Hamiltonian problem but 
appear as hard constraints partitioning the phase space 
(0; [m ITssh.^21 A segment of a 3D knot, in other 
words, can move without feeling the constraints due to 
the non-trivial topology of a knotted state, until it actually 
collides with another segment. The accessible phase 
space of degrees of freedom is therefore characterised by 


^21 For comparison, self-avoidance in 3D is usually treated as a 
perturbation, i.e., as a "soft constraint", in analytical studies (146). 

inequalities. 

Consequently, only a relatively small range of problems 
have been treated analytically, starting with the 
seminal papers by Edwards (189, 190), in which he 
considers the classification of topological constraints in 
polymer physics. De Gennes addressed the problem of 
tight knot motion along a polymer chain using scaling 
arguments for the activation of free length inside the 
knot region, producing a double-exponential expression 
for the corresponding time scale (175), which might explain 
the extreme long-time contributions in the relaxation 
time spectrum of permanently entangled polymers 
(0; [III [III; [iTlflQll ). Some analytical results were obtained 
for a pair, or an 'Olympic' gel, of entangled polymer 
rings, see for instance (192, 193, 194, 195, 196). 
In a mean field approach based on the Kauffman invariant, 
the entropy of knots was investigated in references 
(197, 198, 199). Similarly, some statistical properties of 
random knot diagrams were investigated in (200, 201). 
However, some insight can be gained on the basis of 
phenomenological models, which we will come back to below. 
Here, we continue with an analytical study of flat knots. 

One possibility to treat knotted polymer chains analytically 
is to confine the degrees of freedom of the knot 
to motion in 2D only. The knot, that is, is preserved, 
as at the crossings the chain is allowed to form an 
over-/underpassing, while the rest of the knot is confined to 
2D. Such a confinement can in fact be experimentally 
realised in various ways. Thus, the chain can be confined 
between two close-by glass slabs, as demonstrated 
in (202); it can be pressed flat on a surface by gravitation 
or similar forces, for instance in macroscopic systems 
(158, 203); or the chain can be adhesively bound to a 
membrane and still reach configurational equilibrium, as 
experimentally shown for DNA in references (166, 204). 
It can also be adsorbed to a mica surface either by APTES 
coating or by providing bivalent Mg ions in solution, as 
shown in figure 20. From such flat knots as discussed 
in the remainder of this section, we will be able to infer 
certain generic features also for 3D knots. 

A flat knot therefore corresponds to a polymer network 
in 2D, but the orientation of the crossings is preserved, 
such that the network graph actually coincides with a 
typical knot projection (114, 115, 116), as shown in 
figure 21 on the left. This projection of the trefoil, and similar 
projections for all knots, displays the knot with the 
essential crossings. A flat knot can, in principle, acquire 
an arbitrary number of crossings by Reidemeister moves; 
for instance, the bottom left segment of the flat trefoil 
can slide under the vicinal segment, creating a new pair 
of vertices, and so on. However, we suppose that such 



^22 Although a similar statement is true for polymer networks in 
3D, the field theoretical results for their critical exponents are in 
fact obtained as averages over all topologies. For instance, the 
exponent $\nu$ entering the gyration radius of a 3D polymer ring 
counts all knotted states (147). 




FIG. 20 AFM tapping-mode images of flattened, complex 
DNA-knots with approximately 30-40 essential crossings, see 
(166). The substrate surface used is AP-mica (freshly cleaved 
mica reacted with an amino terminal silane to obtain a 
positively charged surface). The DNA knots used are extracted 
from bacteriophage P4; the DNA is an 11.4 kbp molecule (with 
a 1.4 kbp deletion resulting in a final length of 10 kbp) which 
has two cohesive ends. They are not covalently closed, thus 
no supercoiling is present. The knot adsorbed out of the 3D 
bulk onto the surface is strongly trapped, i.e., the knot is 
'projected' onto the surface without any equilibration. The 
knot appears rather delocalised. Courtesy F. Valle and G. 
Dietler. 



transient additional loops are sufficiently short-lived so 
that we can neglect them in our analysis. Then, we can 
apply results from scaling analysis of polymer networks of 
the most general type shown in figure [7^ see the primer 
in the appendix. We note that from the Monte Carlo 
simulations we performed it may be concluded that such 
additional vertices can in fact be neglected. 



1. Flat knots in dilute phase. 

We had previously found that for the F8-structure the 
probability density function for the size of each loop is 
peaked at $\ell \to 0$ and $\ell \to L$. From the scaling analysis 
for self-avoiding polymer networks, we concluded 
strong localisation of one subloop. For more complicated 
structures, the joint probability to find the individual 
segments with given lengths $s_i$ is expected to peak at 
the edges of the higher-dimensional configuration hyperspace. 
Some analysis is necessary to find the characteristic 
shapes. Let us consider here the simplest non-trivial 
knot, the (flat) trefoil knot $3_1$ shown in figure 21. Each 
of the three crossings is replaced with a vertex with four 
outgoing legs, and the resulting network is assumed to 
separate into a large loop and a multiply connected region 
which includes the vertices. Let $\ell = \sum_{i=1}^{5} s_i$ be 
the total length of all segments contained in the multiply 
connected knot region. Accordingly, the length of the 
large loop is $s_6 = L - \ell$. 

FIG. 21 Flat trefoil knot with segment labels. On the right, 
a schematic representation of a localised flat trefoil with one 
large segment is shown. 

In the limit $\ell \ll L$, the number of configurations of this 
network can be derived in a similar way as in the scaling 
approach followed for the F8. This procedure determines 
the concrete behaviour of the scaling form 

$$\omega_{\mathrm{III}} \simeq \mu^{N} (L-\ell)^{\gamma_{\mathrm{III}}-1} \mathcal{W}_{\mathrm{III}}\!\left(\frac{\ell}{L-\ell}, \frac{s_1}{\ell}, \frac{s_2}{\ell}, \frac{s_3}{\ell}, \frac{s_4}{\ell}, \frac{s_5}{\ell}\right) \qquad (27)$$

including the scaling function $\mathcal{W}_{\mathrm{III}}$ that depends on altogether 
six arguments. The index III is chosen according 
to figure 22, where the flat trefoil configuration in the 
dense phase appears at position III of the scheme (explained 
below). After some manipulations, the number 
of degrees of freedom takes the form (152) 

$$\omega_{\mathrm{III}}(\ell, L) \simeq \mu^{N} (L-\ell)^{-d\nu} \ell^{-c}, \qquad (28)$$

with the scaling exponent 

$$c = -(\gamma_{\mathrm{III}} - 1 + d\nu) - m, \qquad m = 4. \qquad (29)$$

Here, m = 4 corresponds to the number of independent 
integrations over the segments $s_i$ (i = 1, ..., 4) of the 
knot region, as we only retain the cumulative size 
$\ell = \sum_{i=1}^{5} s_i$ of the knot region. Inserting numerical values, we 
find c = 65/16, i.e., strong localisation. 

However, some care is necessary in performing these 
integrations, since the scaling function $\mathcal{W}_{\mathrm{III}}$ may exhibit 
non-integrable singularities if one or more of the arguments 
$s_i/\ell$ tend to 0. The geometries corresponding to 
these limits (edges of the configuration hyperspace) represent 
contractions of the original trefoil network $\mathcal{G}_{\mathrm{III}}$ in 
the sense that the length of one or more of the segments 
$s_i$ is of the order of the short-distance cutoff a. If such a 
short segment connects different vertices, they cannot be 
resolved on larger length scales, but appear as a single, 
new vertex. Thus, each contraction corresponds to a different 
network $\mathcal{G}$, which may contain a vertex with up to 
eight outgoing legs. For the flat trefoil knot, there exist 
six different contractions, as grouped in figure 22 around 
the original flat trefoil at position III. As an example, 
in the top row of figure 22, contraction VI follows from 
the original trefoil III if the uppermost segment becomes 
very small, and similarly the network VII emanates from 
contraction VI if one of the four symmetric segments becomes 
very small. For each of these networks, one can 
calculate the corresponding exponent c in a similar way 
as above, leading to the general expression 

$$c = 2 + \sum_{N \ge 4} n_N \left[\frac{N}{2}(d\nu - 1) + (|\sigma_N| - d\nu)\right], \qquad (30)$$

where $n_N$ denotes the number of vertices with N outgoing legs. 
The $\sigma_N$ are given in equation (80). In figure 22, the 
various contractions are arranged in order of increasing 
exponent c. 
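
The exponents quoted in this section can be cross-checked with exact rational arithmetic. The following Python sketch is illustrative only. It assumes the two-dimensional dilute values $\nu = 3/4$ and $\sigma_N = (2-N)(9N+2)/64$ (Duplantier's 2D result, used here in place of equation (80), which is not reproduced in this section) and the network exponent $\gamma = 1 - d\nu\mathcal{L} + \sum_N n_N \sigma_N$; the function names are hypothetical.

```python
from fractions import Fraction

D = 2                 # embedding dimension (assumption: 2D dilute phase)
NU = Fraction(3, 4)   # Flory exponent in 2D

def sigma(order):
    """Vertex exponent sigma_N in 2D, assumed to be (2 - N)(9N + 2)/64."""
    return Fraction((2 - order) * (9 * order + 2), 64)

def c_from_vertices(vertex_counts):
    """Equation (30): c = 2 + sum_N n_N [(N/2)(d nu - 1) + (|sigma_N| - d nu)].

    vertex_counts maps the vertex order N to the number n_N of such vertices.
    """
    dnu = D * NU
    return 2 + sum(
        n * (Fraction(order, 2) * (dnu - 1) + abs(sigma(order)) - dnu)
        for order, n in vertex_counts.items()
    )

def c_from_gamma(vertex_counts, loops, integrations):
    """Route of equation (29): c = -(gamma - 1 + d nu) - m."""
    dnu = D * NU
    gamma = 1 - dnu * loops + sum(
        n * sigma(order) for order, n in vertex_counts.items()
    )
    return -(gamma - 1 + dnu) - integrations

figure_eight = c_from_vertices({4: 1})   # one vertex of order four
trefoil = c_from_vertices({4: 3})        # uncontracted flat trefoil
print(figure_eight, trefoil)
```

Both routes agree for the two cases treated in the text: one four-vertex with two loops and no integration gives c = 43/16, and three four-vertices with four loops and m = 4 give c = 65/16. Contractions containing vertices of order six or eight can be evaluated by passing the corresponding counts, for example `{4: 1, 6: 1}`.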

Our scaling analysis relies on an expansion in $a/\ell \ll 1$, 
and the values of c determine a sequence of contractions 
according to higher orders in $a/\ell$: The smallest value of 
c corresponds to the most likely contraction, while the 
others represent corrections to this leading scaling behaviour, 
and are thus less and less probable (see figure 
[^ ). To lowest order, the trefoil behaves like a large ring 
polymer at whose fringe the point-like knot region is located. 
At the next level of resolution, it appears contracted 
to the figure-eight shape $\mathcal{G}_{\mathrm{I}}$. For more accurate 
data, the higher order shapes II to VII may be found with 
decreasing probability. Interestingly, the original uncontracted 
trefoil configuration ranks third in the hierarchy 
of shapes. 

These predictions were checked by MC simulations 
with the same conditions as described above to prevent 
intersection. The flat trefoil knot was prepared 
from a symmetric, harmonic 3D representation with 512 
monomers, which was collapsed and then kept on a hard 
wall by the "gravitational" field $V = -k_B T h/h^*$ perpendicular 
to the wall, where h is the height of a monomer, 
and $h^*$ was set to 0.3 times the bead diameter. Configurations 
corresponding to contraction I are then selected 
by requiring that, besides a large loop, they contain only 
one segment larger than a preset cutoff length (taken to 
be 5 monomers), and similarly for contraction II. The 
size distributions for such contractions, as well as for all 
possible knot shapes, are shown in figure 22. The tails of 
the distributions are indeed consistent with the predicted 
power laws, although the data (especially for contraction 
II) are too noisy for a definitive statement. 

Our scaling results pertain to all flat prime knots. In 
particular, the dominating contribution for any prime 
knot corresponds to the figure-eight contraction $\mathcal{G}_{\mathrm{I}}$, as 
equation (30) predicts a larger value of the scaling exponent 
c for any network $\mathcal{G}$ other than $\mathcal{G}_{\mathrm{I}}$. Accordingly, 
figure 23 demonstrates the tightness of the prime knot 
$8_{19}$. Composite knots, however, can maximise the number 
of configurations by splitting into their prime factors 
as indicated in figure [Ml for $3_1 \# 3_1$. Each prime factor 
is tight and located at the fringe of one large loop, and 
accounts for an additional factor of L for the number of 
configurations as compared to a ring of length L without 
a knot. Indeed, this gain in entropy leads to the tightness 
of knots. 

Flat knots can experimentally be produced 
by 'projecting' a dilute 3D knot from the bulk onto a 
mica surface, on which the knot is adsorbed. Variation 
of the ionic strength in the solution determines whether 
the knot is going to be strongly trapped on the surface 
such that, once captured on the surface, it is completely 
immobilised (small ionic strength); or whether the adsorption 
is weaker such that the knot can (partially) equilibrate 
while being confined to 2D, i.e., equilibrate as a 
flat knot. Figure 20 shows a strongly trapped complex 
knot, whereas figure [^ depicts a weakly adsorbed simple 
knot, compare (166). 

FIG. 22 Hierarchy of the flat trefoil knot $3_1$. Upper row: dilute phase. Middle row: $\Theta$-phase. Bottom row: dense phase. To 
the left of each row, the trefoil projection is shown. It splits up into the hierarchies of configurations, with exponents c below 
each contraction. The small protruding legs represent the big central loop, compare, for instance, figure 21 on the right with 
position III of the top row. See text for details. 

FIG. 23 Power law tails in probability density functions for 
the size $\ell$ of tight segments: As defined in the figure, we 
show results for the smaller loop in a figure-eight structure, 
the overall size of the trefoil knot, as well as the two leading 
contractions of the latter. 



2. Flat knots under $\Theta$ and dense conditions.

In many situations, polymer chains are not dilute.
Polymer melts, gels, or rubbers exhibit fairly high densities
of chains, and the behaviour of an individual chain
in such systems is significantly different compared to the
dilute phase (146; 172; 191). Similar considerations apply
to biomolecules: in bacteria, the gyration radius of
the almost freely floating ring DNA may sometimes be
larger than the cell radius itself. Moreover, under certain
conditions, there is a non-negligible osmotic pressure due
to vicinal layers of protein molecules, which tends to confine
the DNA (205; 206; 207). In protein folding studies,
globular proteins in their native state are often modelled

FIG. 24 Typical configurations of 256-mer chains for the trefoil
$3_1$, the prime knot $8_{19}$, and the composite knot $3_1\#3_1$
consisting of two trefoils, in d = 2. The initial conditions
were symmetric in all cases.

as compact polymers on a lattice (see (208) for a recent
review).

A polymer is considered dense if, on a lattice, the fraction
$f$ of occupied sites has a finite value $f > 0$. This
can be obtained by considering a single polymer of total
length $L$ inside a box of volume $V$ and taking the limit
$L \to \infty$, $V \to \infty$ in such a way that $f = L/V$ remains
finite (209; 210; 211). Alternatively, dense polymers can
be obtained in an infinite volume through the action of
an attractive force between monomers. Then, for temperatures
$T$ below the collapse (Theta) temperature $\Theta$,
the polymers collapse to a dense phase, with a density
$f > 0$, which is a function of $T$ (210; 212; 213; 214).
For a dense polymer in $d$ dimensions, the exponent $\nu$,
defined by the radius of gyration $R_g \sim L^{\nu}$, becomes
$\nu = 1/d$. The limit $f = 1$ is realised in Hamiltonian
paths, where a random walk visits every site of a given
lattice exactly once (215; 216). Dense polymers may be
related to 2D vesicles and lattice animals (branched polymers)
(217; 218; 219; 220).

FIG. 25 Flat knot imaged by AFM, similar to the one shown
in figure 20. However, this knot is rather simple (likely a
trefoil) and was allowed to relax while attaching to freshly
cleaved mica in the presence of bivalent Mg counterions.
Courtesy E. Ercolini, J. Adamcik, and G. Dietler. Note the
close resemblance to the trefoil configuration shown in figure

As studied in reference (221), the value of the exponent
c for the 2D dense F8 is (compare to the appendix)

$$c = -\gamma_{F8} = 11/8 = 1.375, \qquad (31)$$

implying that the smaller loop is weakly localised. This
means that the probability for the size of each loop is
peaked at $\ell = 0$ and, by symmetry, at $\ell = L$. An analogous
reasoning for the 2D F8 at the $\Theta$ point gives

$$c = 11/7 = 1.571. \qquad (32)$$

In both cases the smaller loop is weakly localised in the
sense that $\langle \ell \rangle / L \to 0$. Figure 26 shows the symmetric
initial and a typical equilibrium configuration for periodic
boundary conditions obtained from Monte Carlo
(MC) simulations, see reference (221) for details. In figure 26,
the lines represent the bonds (tethers) between
the monomers (beads, not shown here). The three black
dots mark the locations of the tethered beads forming
the slip-link in 2D. The initial symmetric configuration
soon gives way to a configuration with $\ell \ll L$ on approaching
equilibrium. Figure 27 shows the development
of this symmetry breaking as a function of the number
of MC steps. We note, however, that the fluctuations of
the loop sizes in the "stationary" regime appear to be
larger in comparison to the dilute case studied in reference
(153), compare figure [TH]. We checked that for densities
(area coverage) above 40% the scaling behaviour
becomes independent of the density. (The above simulation
results correspond to a density of 55%.) The size
distribution data are well fitted by a power law (for over

FIG. 26 Symmetric ($\ell = L/2 = 128$) initial configuration
of a 2D dense F8 (left) and equilibrium configuration (right)
with periodic boundary conditions. The two different grey
values correspond to the two subloops created by the slip-link.
The slip-link itself is represented by the three (tethered)
black dots.

1.5 decades with 1024 monomers), and the corresponding
exponent with 512 and 1024 monomers in figure 27 is in
good agreement with the predicted value (31).
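The notion of weak localisation used above can be made concrete with a short numerical sketch. It assumes a pure power law $p(\ell) \propto \ell^{-c}$ for the smaller loop between a hypothetical monomer-size cutoff `a` and $L/2$, and evaluates $\langle \ell \rangle / L$ in closed form. The threshold $c = 2$ separating weak from strong localisation is the usual convention and is an assumption here, as are all function names.

```python
import math

def power_integral(p, lo, hi):
    """Integral of x**p from lo to hi (handles the logarithmic case p = -1)."""
    if abs(p + 1.0) < 1e-12:
        return math.log(hi / lo)
    return (hi ** (p + 1.0) - lo ** (p + 1.0)) / (p + 1.0)

def mean_loop_fraction(c, L, a=1.0):
    """<l>/L for p(l) proportional to l**(-c) on a <= l <= L/2.

    a is a hypothetical short-distance cutoff (one monomer); the upper limit
    L/2 reflects that l denotes the smaller of the two loops.
    """
    norm = power_integral(-c, a, L / 2.0)
    first_moment = power_integral(1.0 - c, a, L / 2.0)
    return first_moment / norm / L

def classify(c):
    """Localisation class; the c = 2 threshold is an assumed convention."""
    if c < 1.0:
        return "delocalised"
    if c < 2.0:
        return "weakly localised"
    return "strongly localised"

# Exponents quoted in the text for the dense F8, the Theta-point F8,
# and the dense trefoil.
for name, c in [("dense F8", 11 / 8), ("Theta-point F8", 11 / 7), ("dense trefoil", 1 / 8)]:
    fractions = [mean_loop_fraction(c, L) for L in (1e2, 1e4, 1e6)]
    print(name, classify(c), ["%.4f" % f for f in fractions])
```

For $1 < c < 2$ the printed fraction decreases with growing $L$, whereas for $c < 1$ it approaches a constant, which is the distinction between weakly localised and delocalised shapes drawn in the text.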

For our MC analysis, we again used a hard core bead- 
and-tether chain, in which self-crossings were prevented 
by keeping a maximum bead-to-bead distance of 1.38 
times the bead diameter, and a maximum step length 
of 0.15 times the bead diameter. To create the dense F8 
initial condition, a free F8 is squeezed into a square
box with hard walls. This is achieved by starting off from
the free F8, surrounding it by a box, and turning on a 
force directed towards one of the edges. Then, the op- 
posite edge is moved towards the centre of the box, and 
so on. During these steps, the slip-link is locked, i.e., the 
chain cannot slide through it, and the two loops are of 
equal length during the entire preparation. Finally, when 
the envisaged density is reached, the hard walls are re- 
placed by periodic boundary conditions, and the slip-link 
is unlocked. After each step, the system is allowed to re- 
lax for times larger than the localisation times occurring 
at the main stage of the run. 
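The elementary move of such a bead-and-tether simulation can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the cited study: it treats a plain closed ring in 2D without slip-link, box, or periodic boundaries, and all function names are hypothetical. Only the two numerical limits (1.38 and 0.15 bead diameters) are taken from the text.

```python
import math
import random

BEAD_DIAMETER = 1.0
MAX_TETHER = 1.38 * BEAD_DIAMETER   # maximum bead-to-bead distance (from the text)
MAX_STEP = 0.15 * BEAD_DIAMETER     # maximum step length (from the text)

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def make_ring(n, spacing=1.2):
    """Regular polygon of n beads; spacing lies between 1 and 1.38 diameters."""
    radius = spacing / (2.0 * math.sin(math.pi / n))
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

def attempt_move(beads, i, rng):
    """Trial displacement of bead i; accept only if all constraints hold."""
    n = len(beads)
    step = MAX_STEP * math.sqrt(rng.random())      # uniform within a disc
    angle = 2.0 * math.pi * rng.random()
    trial = (beads[i][0] + step * math.cos(angle),
             beads[i][1] + step * math.sin(angle))
    for j in ((i - 1) % n, (i + 1) % n):           # tethers to chain neighbours
        if distance(trial, beads[j]) > MAX_TETHER:
            return False
    for j in range(n):                             # hard-core exclusion
        if j != i and distance(trial, beads[j]) < BEAD_DIAMETER:
            return False
    beads[i] = trial
    return True

def run(n_beads=32, n_attempts=20000, seed=0):
    """Apply random single-bead moves and count how many are accepted."""
    rng = random.Random(seed)
    beads = make_ring(n_beads)
    accepted = sum(attempt_move(beads, rng.randrange(n_beads), rng)
                   for _ in range(n_attempts))
    return beads, accepted

beads, accepted = run()
print("accepted moves:", accepted)
```

Because every accepted move satisfies both the tether and the hard-core condition, the chain stays connected and beads never overlap at any stage of the run.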

An analysis similar to that for the dense/$\Theta$-F8 structure and
the dilute flat trefoil above reveals the number of degrees
of freedom for the flat dense trefoil in the form (221)

$$\omega_3(\ell, L) \sim \omega_0(L)\, \ell^{-c}, \qquad (33)$$

with $c = -\gamma_3 - m$, where $\gamma_3 = -33/8$ follows from the
network exponent given in the appendix and $m = 4$ is the
number of independent integrations over chain segments.
Thus, $c = 1/8 < 1$, which implies that the dense 2D
trefoil is delocalised. As above, we have to consider the
various possible contractions of the flat knot. For dense
polymers, the present scaling results show that both the
original trefoil shape ($c = 1/8 < 1$, see above) and position
II ($c = 3/4 < 1$) are in fact delocalised and represent
equally the leading scaling order (cf. top part of figure
22). The F8 is only found at the third position and is
weakly localised ($c = 11/8 > 1$). In an MC simulation of
the dense 2D trefoil, we predict that one mainly observes
delocalised shapes corresponding to the original trefoil
and position II in figure 22, and further, with decreasing
probability, the weakly localised F8 and the other shapes
of the hierarchy (top part) in figure 22.

FIG. 27 The loop size probability distribution $p(\ell)$ at $\rho$ =
55% area coverage, for the F8 with 512 (top) and 1024 (bottom)
monomers. The power law with the predicted exponent
c = 1.375 in equation (31) is indicated by the dotted
line.

These predictions are consistent with the numerical
simulations of reference (185), whose authors observe that the mean
value of the second largest segment of the simulated 2D
dense trefoil configurations grows linearly with L, and
conjecture the same behaviour also for the other segments,
corresponding to the delocalisation of the trefoil
obtained above.

An analogous reasoning can be applied to the 2D trefoil
in the $\Theta$ phase. We find that in this case the
leading shape is again the original (uncontracted) trefoil,
with c = 5/7 < 1. This implies that the 2D trefoil is
delocalised also at the $\Theta$ point. All other shapes are at
least weakly localised, and subdominant to the leading
scaling order represented by the original trefoil. The resulting
hierarchy of shapes is shown in figure 22 (bottom
part).



G. 3D knots defy complete analytical treatment. 

As already mentioned, 3D knots correspond to a problem
involving hard constraints that defy a closed analytical
treatment. It may be possible, however, that by a
suitable mapping to, for instance, a field theory, an analytical
description may be found. This may in fact be
connected to the study of knots in diagrammatic solutions
in high energy physics (222). There exists a fundamental
relation between knots and gauge theory, as
knot projections and Feynman graphs share the same
basic ingredients corresponding to a Hopf algebra.
However, up to now no such mapping has been found,
and a theoretical description of 3D knots based on first
principles is presently beyond hope. To obtain some insight
into the statistical mechanical behaviour of knotted
chains, one therefore has to resort to simulation studies
or experiments. In addition, a few phenomenological
models for both the equilibrium and dynamical behaviour
of knots have been suggested, such as in references
(flill : [1691 : [ItI [1761 : [l80l : [l8lll223l: [22l . 

When discussing numerical knot studies, we already
mentioned the Flory-type model of equal loop sizes and the
scaling relation it leads to. One may argue that the differences in
the knot size for the different knot types corresponding
to the same C may be included in the prefactor, which
is independent of the chain length N. Obviously, this
model of equal loop sizes is equivalent to a completely
delocalised knot. This statement is in fact equivalent
to another Flory-type approach to knotted polymers reported
in (171). In this model, the knot is thought of as
an inflatable tube: for a very thin tube diameter, the tube
is equivalent to the original knot conformation; inflating
the tube more and more will increasingly smoothen out
the shape until a maximally inflated state is reached. The
knot is then characterised by the aspect ratio

$$p = \frac{L}{D}, \quad \text{therefore} \quad 1 < p < N, \qquad (34)$$

between length L and maximum tube diameter D. It
appears that p is a (weak) knot invariant, and can be
used to characterise the gyration radius of the knot. It is
clear that, by construction, the aspect ratio describes a
totally delocalised knot, and indeed it turns out that in
good solvent, the gyration radius shows the dependence
$R_g \sim A N^{3/5} \tau^{1/5} p^{-4/15}$, where $\tau$ is the (dimensionless)
deviation from the $\Theta$ temperature (171). Obviously, the
aspect ratio appears to be proportional to the number
of essential crossings, by comparison with the Flory-type
expression mentioned above.
We note that similar considerations are employed in reference
(224), including a comparison to the entropy of a
tight knot, finding comparable entropic likelihood. The
modelling based on the aspect ratio p is further refined
in (141).

Knot localisation is a subtle interplay between the degrees
of freedom of one big loop, and the internal degrees
of freedom of the various segments in the knot region.
Under localisation, the number of degrees of freedom

$$\omega \sim \mu^{N} N^{1 - d\nu} \qquad (35)$$

includes an additional factor N from the knot region encircling
the big loop. For flat knots, the competition
between the single big loop and the knot region is indeed
won by the big loop. In the case of 3D knots, this
balance is presently not resolved for knots of all complexity.
Probably only detailed simulation studies of higher
order knots will make it possible to decide between the various
models of 3D knots. Major contributions are also
expected from single molecule experiments, for instance,
from force-extension measurements along the lines of the
simulation study in (178), the advantage of experiments
being the fact that it should be possible to go towards
rather high chain lengths that are inaccessible in simulations.
To overcome similar difficulties in the context of
the entropic elasticity for rubber networks, Ball, Doi, Edwards
and coworkers replaced permanent entanglements
by slip-links (225; 226; 227; 228). Gaussian networks
containing slip-links have been successful in the prediction
of important physical quantities of rubber networks
(172), and they have been used to study a small number
of entangled chains (229). In a similar fashion, one
may investigate the statistical behaviour of single polymer
chains in which a fixed topology is created by a
number of slip-links. Such 'paraknots' can be studied
analytically using the Duplantier scaling results (153).
As mentioned previously, knowledge of the statistical behaviour
of paraknots can be used to create a knot scale
for calibrating the degrees of freedom of real knots, and
is therefore also important to understand or design indirect
experiments on knot entropy, such as by force-extension
measurements (179). Paraknots may also be useful in the
design of entropy-based functional molecules (230; 231).

V. DNA BREATHING: LOCAL DENATURATION ZONES 
AND BIOLOGICAL IMPLICATIONS 

" A most remarkable physical feature of the DNA helix, 
and one that is crucial to its functions in replication and 
transcription, is the ease with which its component chains 
can come apart and rejoin. Many techniques have been 
used to measure this melting and reannealing behaviour. 
Nevertheless, important questions remain about the ki- 
netics and thermodynamics of denaturation and renatu- 
ration and how these processes are influenced by other 
molecules in the test tube and cell" (0). This remarkable
quotation, despite being 30 years old, still summarises the
challenge of understanding local and global denaturation 
of DNA, in particular, its dynamics. In this section, we 
report recent findings on the spontaneous formation of 
intermittent denaturation zones within an intact DNA 
double helix. Such denaturation bubbles fluctuate in size 
by (random) motion of the zipper forks relative to each 
other. The opening and subsequent closing of DNA bub- 
bles is often called DNA breathing. 

A. Physiological background of DNA denaturation 

The Watson-Crick double-helix is the thermodynamically
stable configuration of a DNA molecule under physiological
conditions (normal salt and room/body temperature).
This stability is effected (a) by Watson-Crick
H-bonding, which is essential for the specificity of base-pairing,
i.e., for the key-lock principle according to which
the nucleotide Adenine exclusively binds to Thymine,
and Guanine only to Cytosine. Base-pairing therefore
guarantees the high level of fidelity during replication and
transcription. (b) The second contribution to DNA-helix
stability comes from base-stacking between neighbouring
bps: through hydrophobic interactions between the planar
aromatic bases, which overlap geometrically and electronically,
the bp stacking stabilises the helical structure
against the repulsive electrostatic force between the negatively
charged phosphate groups located at the outside
of the DNA double-strand. While hydrogen bonds contribute
only little to the helix stability, the major support
comes from base-stacking (0; 233).

FIG. 28 Fraction $\theta_h$ of double-helical domains within the
DNA as a function of temperature. Schematic representation
of $\theta_h(T)$, showing the increased formation of bubbles
and unzipping from the ends, until full denaturation has been
reached.

The quoted ease with which its component chains can
come apart and rejoin, without damaging the chemical
structure of the two single-strands, is crucial to many
physiological processes such as replication via the proteins
DNA helicase and polymerase, and transcription
through RNA polymerase. During these processes, the
proteins unzip a certain region of the double-strand, to
obtain access to the genetic information stored in the
bases in the core of the double-helix (3; 0; 232). This
unzipping corresponds to breaking the hydrogen bonds
between the bps. Classically, the so-called melting and
reannealing behaviour of DNA has been studied in solution
in vitro by increasing the temperature, or by titration
with acid or alkali. During thermal melting, the
stability of the DNA duplex is related to the content of
triple-hydrogen-bonded G-C bps: the larger the fraction
of G-C pairs, the higher the required melting temperature
or pH value. Thus, under thermal melting, dsDNA
starts to unwind in regions rich in A-T bps, and then
proceeds to regions of progressively higher G-C content
(233). Conversely, molten, complementary chains of
single-stranded DNA (ssDNA) begin to reassociate and
eventually reform the original double-helix under incubation
at roughly 25° below the melting temperature
(0). The relative amount of molten DNA in a solution can
be measured by UV spectroscopy, revealing large changes
in absorption in the presence of perturbed base-stacking
(234). Careful melting studies allow one to obtain accurate
values for the stacking energies of the various combinations
of neighbouring bps, a basis for detailed thermodynamic
modelling of DNA-melting and DNA-structure
per se (235; 236). In fact, thermal melting data have
been successfully used to identify coding sequences of the
genome due to the different G-C content (237; 238; 239).

FIG. 29 Overstretching of double-stranded DNA. The black
curve shows the typical force-extension behaviour of DNA
following the rapid worm-like chain increase until at around 65
pN a plateau is reached. Crossing of the plateau corresponds
to progressive mechanical denaturation. See text for details.
Figure courtesy Mark C. Williams.

Complementary to thermal or pH induced denaturation,
dsDNA can be driven toward denaturation mechanically,
by applying a tensional stress along the DNA in
an optical tweezer trap (240). As shown in figure 29, the
force per extension increases in worm-like chain fashion,
until a plateau at approximately 65 pN is reached. This
plateau is sometimes interpreted as a new DNA configuration,
the S form (241). On the basis of a series of experiments, it
appears more likely that the plateau corresponds to the
mechanical denaturation transition (242). To first order,
the effect of the longitudinal pulling translates into an
external torque $\mathcal{T}$, whose effect is a decrease in the free
energy for melting a bp:

$$\Delta G_{\mathcal{T}} = \Delta G_{\mathcal{T}=0} - \mathcal{T}\theta_0, \qquad (36)$$

where $\theta_0 = 2\pi/10.35$ is the twist angle per bp of the
double helix (243).
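To get a feeling for the size of the torque correction in equation (36), the following sketch evaluates the twist angle per bp and the resulting free energy shift. The torque of 10 pN nm and the thermal energy of 4.1 pN nm (roughly room temperature) are illustrative assumptions, not values from the text, and the function name is hypothetical.

```python
import math

BP_PER_TURN = 10.35                     # helical repeat used in the text
THETA_0 = 2.0 * math.pi / BP_PER_TURN   # twist angle per bp in radians
KBT_PN_NM = 4.1                         # assumed thermal energy near room temperature, in pN nm

def melting_free_energy(delta_g_zero, torque):
    """Equation (36): free energy for melting a bp under an external torque.

    Both arguments are in pN nm; the torque acts per radian of twist.
    """
    return delta_g_zero - torque * THETA_0

shift = melting_free_energy(0.0, 10.0)   # hypothetical torque of 10 pN nm
print("twist angle per bp: %.3f rad" % THETA_0)
print("shift: %.2f pN nm = %.2f kBT" % (shift, shift / KBT_PN_NM))
```

With these assumed numbers the shift per bp is of the order of one thermal energy, which indicates why moderate torques can noticeably bias the melting equilibrium.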

An important application of thermal DNA melting is
the Polymerase Chain Reaction (PCR). In PCR, dsDNA
is melted at elevated temperatures into two strands of
ssDNA. By lowering the temperature in a solution of invariable
primers and single nucleotides, each ssDNA is
completed to dsDNA by the key-lock principle of base-pairing
(244; 245). By many such cycles, of the order of
10^ copies of the original DNA can be produced within
the range of hours. Again, the error rate due to the
underlying biochemistry can be considered negligible for
most purposes. In particular, from the viewpoint of polymer
physics/chemistry, the obtained sample is monodisperse
and free of parasitic reactions, creating (almost)
ideal samples for physical studies, in particular, as any
designed sequence of bases can be custom-made in modern
molecular biology labs Q).

While the double-helix is the thermodynamically stable
configuration of the DNA molecule below $T_m$ (at non-denaturing
pH), even at physiological conditions there
exist local denaturation zones, so-called DNA-bubbles,
predominantly in A-T-rich regions of the genome (234;
262). Driven by ambient thermal fluctuations, a DNA-bubble
is a dynamical entity whose size varies by thermally
activated zipping and unzipping of successive bps
at the two forks where the ssDNA-bubble is bordered by
the dsDNA-helix. This incessant zipping and unzipping
leads to a random walk in the bubble-size coordinate, and
to a finite lifetime of DNA-bubbles under non-melting
conditions, as eventually the bubble closes due to the energetic
preference for the closed state (234; 262). DNA-breathing
typically opens up a few bps (246; 247). It has
been demonstrated recently that by fluorescence correlation
methods the fluctuations of DNA-bubbles can be explored
on the single molecule level, revealing a multistate
kinetics that corresponds to the picture of successive zipping
and unzipping of single bps. At room temperature,
the characteristic closing time of an unbound bp
was found to be in the range 10 to 100 $\mu$sec, corresponding
to an overall bubble lifetime in the range of a few msec
(249). The multistate nature of the DNA-breathing was
confirmed by a UV-light absorption study (250). The
zipping dynamics of DNA is also investigated by NMR
methods (251; 252; 253), revealing considerably shorter
time scales than the fluorescence experiments. An interesting
finding from NMR studies is the dramatically
different denaturation dynamics in B' DNA, where more
than three AT bps occur in a row (254). It is conceivable
that fluorescence correlation and NMR probe different
levels of the denaturation dynamics. Our analysis of
the single DNA fluorescence data reported below demonstrates
that, despite the much longer time scale, the dependence
of the measured autocorrelation function on the
stacking along the sequence is very sensitive, and agrees
well with the quantitative behaviour predicted from the
stability data.
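The picture of a random walk in the bubble-size coordinate with a finite lifetime can be illustrated by a minimal stochastic simulation. The sketch below is a strongly simplified toy model and not the analysis of the fluorescence data: it assumes a homopolymer, a single zipping rate `k` for closing one bp, an unzipping rate `k * u` with `u < 1` for opening one more bp, and it neglects the loop entropy factor and the bubble initiation step. All names and parameter values are hypothetical.

```python
import random

def bubble_lifetime(k, u, rng, m0=1):
    """Time until a bubble of initial size m0 closes (m reaches 0).

    Requires u < 1; otherwise the mean lifetime diverges.
    """
    m, t = m0, 0.0
    while m > 0:
        t += rng.expovariate(k * (1.0 + u))      # waiting time to the next event
        if rng.random() < u / (1.0 + u):
            m += 1                                # unzip one more bp
        else:
            m -= 1                                # zip one bp closed
    return t

def mean_lifetime(k, u, n_runs=20000, seed=1):
    """Average lifetime over n_runs independent bubbles."""
    rng = random.Random(seed)
    return sum(bubble_lifetime(k, u, rng) for _ in range(n_runs)) / n_runs

k, u = 1.0, 0.5        # hypothetical zipping rate and Boltzmann factor
print("simulated:", mean_lifetime(k, u), "expected:", 1.0 / (k * (1.0 - u)))
```

For this biased walk the mean first-passage time from size 1 to closure is $1/[k(1-u)]$, so the bubble lifetime grows without bound as $u \to 1$, i.e., on approaching the melting transition.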

The presence of fluctuating DNA-bubbles is essential
to the understanding of the binding of single-stranded
DNA binding proteins (SSBs) that selectively bind to ssDNA,
and that play important roles in replication, recombination
and repair of DNA (0). One of the key
tasks of SSBs is to prevent the formation of secondary
structure in ssDNA (P; [3). From the thermodynamical
point of view one would therefore expect SSBs to
be of an effectively helix-destabilising nature, and thus
to lower $T_m$ (255). However, it was found that neither
the gp32 protein from the T4 phage nor E. coli SSBs do
(255; 256; 257). An explanation of this apparent paradox
was suggested to consist in a kinetic block, i.e., a kinetic
regulation such that the rate constant for the binding
of SSBs is smaller than the one for bubble closing
(257; 258). This hypothesis could recently be verified in
extensive single molecule setups using mechanical overstretching
of dsDNA by optical tweezers in the presence
of T4 gene 32 protein (259; 260; 261), as detailed below.

Most proteins denature at temperatures between 40 and 60 °C, including
polymerases. In early PCR protocols, after each heating
step new polymerase had to be washed into the reaction chamber.
Modern protocols make use of heat-resistant polymerases
that survive the temperatures necessary in melting. Such heat-resistant
proteins occur, for instance, in bacteria dwelling near
undersea thermal vents.

Essentially, the zipper model advocated by Kittel (248).



B. The Poland-Scheraga model of DNA melting 

The most widely used approach to DNA melting in
bioinformatics is the statistical, Ising model-like Poland-Scheraga
model (sometimes also referred to as Bragg-Zimm
model) and its variations (234; 262; 263); see also
([i5i;[i5i;[26i;[26i). It defines the partition function $\mathscr{Z}$
of a DNA molecule in a grand canonical picture with arbitrarily
many bubbles. For simplicity, we will restrict the
following discussion to a single bubble. Below the melting
temperature $T_m$, the one bubble picture is a good
approximation: due to the high energy cost of bubble initiation,
the distance between bubbles on a DNA molecule
is large, and bubbles behave statistically independently.
In typical experimental setups for measuring the bubble
dynamics (see below), the DNA construct used is
actually designed to host an individual bubble. For a
homopolymer stretch of double-stranded DNA with 400
bps, figure 30 shows the probabilities to find zero, one,
or two bubbles as a function of the Boltzmann factor
$u = \exp(\Delta G/RT)$ for denaturation of a single bp.
Even at the denaturation transition $\Delta G = 0$, it is quite
unlikely to find two bubbles simultaneously.

The free energy $\Delta G$ to break an individual bp is
constructed as follows. We mention two different approaches.
Common to both is the Poland-Scheraga construction
of the partition function. We start with the
case that a linear DNA molecule denatures from one of its



In biochemistry, energies are usually measured in calories per
mol. Instead of the Boltzmann factor $\beta = 1/k_B T$ commonly used
in physics and engineering, it is therefore convenient to replace
the Boltzmann constant $k_B$ by the gas constant $R = k_B N_A$,
where $N_A$ is the Avogadro-Loschmidt number.







the free energy ()24 



FIG. 30 Probability of having 0, 1, or 2 bubbles as a function 
of u for a DNA region of chain length 400 bps. The coop- 



erativity parameter was ao 
exponent c=1.76 (see text). 



10 and the loop correction 



ends. The corresponding partition function is (234; 262)

$$\mathscr{Z}_{\mathrm{end}}(m) = \prod_{x=1}^{m} \exp\left(\frac{\Delta G_{x,x+1}}{RT}\right), \qquad (37)$$

where $m$ is the number of broken bps, and $\Delta G_{x,x+1}$ is
the stacking free energy for disruption of the bp at position
$x$ measured from the end of the DNA. The notation
explicitly refers to the stacking between the bp at $x$ and
$x + 1$. The first closed bp is located at $x = m + 1$. For
a homonucleotide, $\Delta G_{x,x+1} = \Delta G$, while for a given sequence
of bps, there come into play the different stacking
energies for the possible combinations of pairs of bps in
sequence. The stacking energies $\Delta G_{x,x+1}$ have the following
contributions.
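As a hedged illustration of how such an end-denaturation weight can be evaluated, the sketch below assumes that the weight of a state with $m$ bps opened from one end is the product of the factors $\exp(\Delta G_{x,x+1}/RT)$ over the broken bps, with $\Delta G < 0$ for a stable bp, in line with the definition of $u$ above. The free energy values are invented for illustration and are not measured stability parameters; all function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol K)

def end_weights(stacking_dg, temperature):
    """Weights Z(m) for m = 0..M bps opened from one end.

    stacking_dg[x] is the free energy (kcal/mol) for disrupting bp x+1,
    negative for a stable bp, following u = exp(dG / RT) in the text.
    """
    weights = [1.0]
    for dg in stacking_dg:
        weights.append(weights[-1] * math.exp(dg / (R_KCAL * temperature)))
    return weights

def opening_probabilities(stacking_dg, temperature):
    """Normalised probabilities of having exactly m bps open from the end."""
    weights = end_weights(stacking_dg, temperature)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values: a weak (AT-like) end followed by a stronger (GC-like) stretch.
dg = [-1.0, -1.0, -1.0, -2.0, -2.0, -2.0]
for m, p in enumerate(opening_probabilities(dg, 310.0)):
    print(m, "%.3e" % p)
```

The use of the gas constant in kcal/(mol K) follows the convention of measuring energies per mol. At $\Delta G = 0$ all states become equally likely, which corresponds to the denaturation transition of this toy model.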

The more traditional way to determine the stacking
interactions is by fitting bulk melting curves of DNA constructs
containing exclusively pairs of the specific bp-bp
combination such as (AT/TA)$_n$ (see, e.g., (233; 267)
and references therein). The free energy used in this
automated fit procedure using the MELTSIM algorithm
combines the stacking enthalpy difference $\Delta H^{\mathrm{st}}_{x,x+1}$ for
both hydrogen bond and actual stacking energies, and
the entropy difference $\Delta S_{x,x+1}$ chosen to explicitly depend
on the nature of the broken bp. A recent alternative
to determine the stability parameters of DNA was
developed in the group of Frank-Kamenetskii, leading to
a decomposition of the form $\Delta G_{x,x+1} = \Delta G^{\mathrm{st}}_{x,x+1} + \Delta G^{\mathrm{hb}}_{x}$,
where the Gibbs free energies $\Delta G^{\mathrm{st}}_{x,x+1}$ and $\Delta G^{\mathrm{hb}}_{x}$ measure
the stacking of bps $x$ and $x + 1$ and the hydrogen
bonding of bp $x$ including the entropy release on disruption.
Note that $\Delta G^{\mathrm{hb}}_{x}$ is chosen such that it only depends
on the broken bp and has two values for AT and GC bps,
irrespective of the orientation (3' or 5'). The stacking free
energies $\Delta G^{\mathrm{st}}$ were determined from denaturation at a
DNA nick and show a pronounced asymmetry between
AT/TA and TA/AT bonds (247). For an end-denaturing
DNA both descriptions are equivalent (though somewhat
different when one puts in numbers), as the breaking of each
bp involves the disruption of the hydrogen bonds of bp
$x$ and one stacking with its neighbour.

The difference between the two approaches becomes
apparent when we consider the initiation of a bubble, i.e.,
a denatured coil enclosed by intact double-helix. Now,
the partition function for a bubble with left fork position
at $x_L$ and consisting of $m$ broken bps,

$$\mathscr{Z}_{\mathrm{mid}}(x_L, m) = \Lambda\, \Sigma(m) \prod_{x=x_L}^{x_L+m} e^{\Delta G_{x,x+1}/RT}, \qquad (40)$$

differs from (37) in three respects: (i) While the bubble
consists of $m$ molten bps, $m + 1$ stacking interactions
need to be broken to create two boundaries between intact
double-strand and the single-strand in the bubble;
the extra stacking interaction is effectively incorporated
into $\Lambda$. (ii) The polymeric nature of the flexible single-stranded
bubbles involves the entropy loss factor $\Sigma(m) = 1/(m + D)^c$
with critical exponent $c$ and the parameter
$D$ to take care of finite size effects (235; 264; 266).
(iii) The factor $\Lambda$: in the standard notation, $\Lambda = \sigma_0 =
\exp(-F_s/RT) \ll 1$ is the cooperativity parameter, while
according to (247), $\Lambda = \xi \approx 10^{-3}$ is called the ring factor.
Interestingly, the cooperativity parameter $\sigma_0$ is of the order
of what corresponds to the singular unbalanced stacking
enthalpy for breaking the first bp to initiate the bubble.
The new stability data lead to a more pronounced
asymmetry in opening probabilities between different bp-bp
combinations. The analysis in references (269; 270)
demonstrates that the parameters from (247) appear to
support better the biological relevance of the TATA motif
in natural sequences, that is, show a more pronounced
simultaneous opening probability for the TATA
motif.
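The three ingredients of the bubble weight, namely a small initiation prefactor, the loop entropy factor $1/(m + D)^c$, and one Boltzmann factor per broken stacking interaction, can be combined in a few lines for the homopolymer case. The sketch below is an illustration under assumptions: the prefactor of $10^{-3}$, the value $u = 0.8$, and the restriction to a bubble at one fixed position are choices made here, while $c = 1.76$ and $D = 1$ are the values mentioned in the text. Function names are hypothetical.

```python
def bubble_weight(m, u, prefactor, c=1.76, D=1.0):
    """Homopolymer bubble weight: prefactor * (m + D)**(-c) * u**(m + 1).

    The power m + 1 counts the stacking interactions broken by m open bps.
    """
    return prefactor * (m + D) ** (-c) * u ** (m + 1)

def bubble_size_distribution(u, prefactor, max_size, c=1.76, D=1.0):
    """Probabilities of the closed state (weight 1) and of bubble sizes 1..max_size."""
    weights = [1.0] + [bubble_weight(m, u, prefactor, c, D)
                       for m in range(1, max_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical parameters below the melting transition (u < 1).
probs = bubble_size_distribution(u=0.8, prefactor=1e-3, max_size=20)
print("closed: %.4f, one open bp: %.2e" % (probs[0], probs[1]))
```

With a prefactor this small the closed state dominates by far, which reflects the high cost of bubble initiation. A larger exponent $c$ further suppresses large bubbles, so a fit that assumes a different $c$ compensates through a different prefactor.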

As demonstrated for the autocorrelation function measuring
the breathing dynamics in figure 32, the description
in terms of the partition function $\mathscr{Z}$ based on the
stability parameters from (247) reproduces the experimental
data well. The analysis in (269; 270) also indicates
that the accuracy of the model predictions for the
bubble dynamics is rather sensitive to the parameters. It
is therefore conceivable that improved fluorescence measurement
of the bubble dynamics may be employed to obtain
accurate DNA stability parameters, complementing
the more traditional melting, NMR, and gel electrophoresis
bulk methods. It should be noted that the dynamics is
strongly influenced by local deviations from the B configuration
of the DNA double helix. Thus, in local stretches
of more than three AT bps in sequence, the B' structure
is assumed, leading to pronouncedly different zipping dynamics
(254).

I.e., an AT bp followed by another AT as different from an AT
followed by a TA, etc.

D = 1 is chosen.

The four bp TATA sequence is one of the typical codes marking
where RNA polymerase starts the transcription process.

Two major questions remain in the thermodynamic
formulation of DNA denaturation and its dynamics,
namely, the exact origin of the bubble initiation factor
$\sigma_0$ (or, alternatively, the ring factor from (247)), and a
method to properly calibrate the zipping rate $k$. The factor
$\sigma_0$ is related to the entropic imbalance on opening the
first bp of a bubble: While this requires the breaking of
two stacking interactions, only one bp has access to an increased
amount of degrees of freedom. Still, these degrees
of freedom must be influenced by the fact that the single
open bp is coupled to two zipper forks. Currently, $\sigma_0$ remains
a fit parameter. The exact value of the zipping rate
$k$ remains open. While NMR experiments indicate much
faster rates in the nanosecond range, the
fluorescence correlation measurements produce values in
the microsecond range ($\sim 10^4$ to $10^5$ sec$^{-1}$). This large
discrepancy may be based on the fact that both methods
have different sensitivity to the amplitude of intra-bp
separation. Currently, $k$ is taken as a fit parameter. In
the analysis in (269; 270), we use the stacking parameters
from (247) including the value of the ring factor $\xi$,
so that $k$ (a shift along the logarithmic abscissa) is the
only adjustable parameter.

It has been under debate what exact value should be
taken for the critical exponent c entering the entropy
loss factor for a denaturation bubble. This is connected
to the fact that c > 2 would imply a first order
denaturation transition on melting, while 1 < c < 2 would
stand for a second order transition (217, 264).
Speculations about a possible first order transition are
connected to the rather distinct spikes in the differential
melting curves (234). Theoretical polymer physics
approaches to explain c > 2 are either based on the
inclusion of polymeric self-avoidance interactions of the
bubble with the remainder of the chain (155), or built on a
directed polymer model (271). Despite the elegance of both
approaches, it is an open question how faithfully they
represent the detailed denaturation behaviour of real DNA



Due to the rather small DNA samples used in melting
experiments (5000 bp and less (234)), claims about the order
of the underlying thermodynamic phase transition should be
considered with some care.




FIG. 31 Clamped DNA domain with internal bps $x = 1$ to $M$,
statistical weights $u_{\rm hb}(x)$, $u_{\rm st}(x)$, and tag
position $x_T$. The DNA sequence enters through the
statistical weights $u_{\rm st}(x)$ and $u_{\rm hb}(x)$ for
disrupting stacking and hydrogen bonds, respectively. The
bubble breathing process consists of the initiation of a
bubble and the subsequent motion of the forks at positions
$x_L$ and $x_R$. See (270) for details.

([27^ : [273h . Applying the MELTSIM algorithm to typical 
sequences, it was found that there is a connection
between the fit result for the cooperativity parameter $\sigma_0$,
whose value is reduced from ~ 10~'^ to w 10~^ by as- 
suming c = 2.12 instead of 1.76 (|236t ). Below the melting 
transition, the typical bubble size is only a few bps, and
in that regime the polymeric treatment of the loop entropy
loss is only approximate. Indeed, in the analysis of (247)
no entropy loss due to polymer ring formation was included.
For the breathing dynamics, we include c to cover higher
temperatures with somewhat larger bubbles, but find no
significant change in the behaviour between c < 2 and
c > 2, as long as the exponent is sufficiently close to 2.
We therefore use the value c = 1.76, which is consistent
with the traditional data fits employed in the
determination of the stacking parameters.



C. Fluctuation dynamics of DNA bubbles: DNA breathing 

Below the melting temperature $T_m$, DNA bubbles are
intermittent, i.e., they form spontaneously due to thermal
fluctuations, and after some time close again. DNA
breathing can be thought of as a biased random walk in
the phase space spanned by the bubble size m and its
position denoted, e.g., by the left zipper fork position
$x_L$ (269, 270). The bubble creation can be viewed as
a nucleation process (274), whereas the bubble lifetime
corresponds to the survival time of the first passage
problem of relaxing to the m = 0 state after a random walk
in the m > 0 half-space HH [IzO; Hzl [Izi; ^7^. Bubble
breathing on the single DNA-bubble level was measured by
fluorescence correlation spectroscopy in (249). This
technique employs a designed stretch of DNA, in which
weaker AT bps form the bubble domain, which is clamped by
stronger GC bonds. In the bubble domain, a
fluorophore-quencher pair is attached. Once the bubble is
created, fluorophore and quencher are separated, and
fluorescence occurs. A schematic of this setup is shown
in figure 31.

The zipper forks move stepwise, $x_{L/R} \to x_{L/R} \pm 1$,
with rates $t^{\pm}_{L/R}(x_L, m)$. We define for bubble
size decrease

$$t^+_L(x_L, m) = t^-_R(x_L, m) = k/2 \quad (m \ge 2) \qquad (41)$$

for the two forks. The rate k characterises a single bp
zipping. Its independence of x corresponds to the view
that bp closure requires the diffusional encounter of the
two bases and bond formation; as sterically AT and GC
bps are very similar, k should not significantly vary with
bp stacking. The rate k is the only adjustable parameter
of our model, and has to be determined from experiment
or future MD simulations. The factor 1/2 is introduced
for consistency (269, 270, 276, 277, 278). Bubble size
increase is controlled by

$$t^-_L(x_L, m) = k\, u_{\rm st}(x_L)\, u_{\rm hb}(x_L)\, s(m)/2,$$
$$t^+_R(x_L, m) = k\, u_{\rm st}(x_R + 1)\, u_{\rm hb}(x_R)\, s(m)/2, \qquad (42)$$

for $m \ge 1$, where $s(m) = \{(1 + m)/(2 + m)\}^c$ and
$x_R = x_L + m + 1$ is the position of the right fork.
Finally, bubble initiation and annihilation from and to the
zero-bubble ground state, $m = 0 \leftrightarrow 1$, occur
with rates

$$t^+_G(x_L) = k\, \xi'\, s(0)\, u_{\rm st}(x_L + 1)\, u_{\rm hb}(x_L + 1)\, u_{\rm st}(x_L + 2),$$
$$t^-_G(x_L) = k. \qquad (43)$$

The rates t fulfil detailed balance conditions. The
annihilation rate $t^-_G(x_L)$ is twice the zipping rate of a
single fork, since the last open bp can close either from
the left or right. Due to the clamping, $x_L \ge 0$ and
$x_R \le M + 1$, ensured by the reflecting conditions
$t^-_L(0, m) = t^+_R(x_L, M - x_L) = 0$. The rates t together
with the boundary conditions fully determine the bubble
dynamics.
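
The statement that the rates fulfil detailed balance can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the equilibrium weight of a bubble is taken to be $\xi' (1+m)^{-c} \prod u_{\rm hb} \prod u_{\rm st}$ over the broken hydrogen bonds and stacking interactions (this weight is not spelled out in the text above), the sequence weights are random placeholders rather than measured stability parameters, and the rates are read off equations (41) to (43) with $x_R = x_L + m + 1$.

```python
import random

C = 1.76  # loop exponent c used in the text


def s(m, c=C):
    """Loop factor s(m) = {(1 + m)/(2 + m)}^c from equation (42)."""
    return ((1 + m) / (2 + m)) ** c


def weight(x_l, m, u_st, u_hb, xi_p, c=C):
    """ASSUMED Boltzmann weight of a bubble with left fork x_l and m open bps."""
    if m == 0:
        return 1.0
    w = xi_p / (1 + m) ** c
    for x in range(x_l + 1, x_l + m + 1):   # m broken hydrogen bonds
        w *= u_hb[x]
    for x in range(x_l + 1, x_l + m + 2):   # m + 1 broken stacking interactions
        w *= u_st[x]
    return w


def max_detailed_balance_error(M, u_st, u_hb, xi_p, k=1.0):
    """Largest relative violation of rate(a->b) Z(a) = rate(b->a) Z(b)."""
    worst = 0.0

    def record(forward, backward):
        nonlocal worst
        worst = max(worst, abs(forward - backward) / max(forward, backward))

    for x_l in range(0, M):
        # initiation versus annihilation of a one-bp bubble, equation (43)
        t_init = k * xi_p * s(0) * u_st[x_l + 1] * u_hb[x_l + 1] * u_st[x_l + 2]
        record(t_init * 1.0, k * weight(x_l, 1, u_st, u_hb, xi_p))
        for m in range(1, M - x_l + 1):
            x_r = x_l + m + 1
            z = weight(x_l, m, u_st, u_hb, xi_p)
            if x_r <= M:   # right fork unzips (42); reverse move zips with k/2 (41)
                t_open = k * u_st[x_r + 1] * u_hb[x_r] * s(m) / 2
                record(t_open * z, (k / 2) * weight(x_l, m + 1, u_st, u_hb, xi_p))
            if x_l >= 1:   # left fork unzips (42); reverse move zips with k/2 (41)
                t_open = k * u_st[x_l] * u_hb[x_l] * s(m) / 2
                record(t_open * z, (k / 2) * weight(x_l - 1, m + 1, u_st, u_hb, xi_p))
    return worst


rng = random.Random(1)
M = 12
# Placeholder weights (index 0 unused): u_hb for x = 1..M, u_st for x = 1..M+1
u_hb = [None] + [rng.uniform(0.2, 2.0) for _ in range(M)]
u_st = [None] + [rng.uniform(0.2, 2.0) for _ in range(M + 1)]
print(max_detailed_balance_error(M, u_st, u_hb, xi_p=1e-3))
```

For any choice of positive weights the printed violation should sit at the level of floating point round-off, which is a convenient consistency test when implementing the rates for a real sequence.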

In the FCS experiment fluorescence occurs if the bps
in a $\Delta$-neighbourhood of the fluorophore position
$x_T$ are open (249). Measured fluorescence time series thus
correspond to the stochastic variable $I(t)$, which takes the
value 1 if at least all bps in $[x_T - \Delta, x_T + \Delta]$
are open, and 0 otherwise. The time averaged (denoted by an
overline) fluorescence autocorrelation

$$A_t(x_T, t) = \overline{I(t) I(0)} - \overline{I(t)}^{\,2} \qquad (44)$$

for the sequence AT9 from (249) is shown in rescaled form in
figure 32. We note that an alternative method to obtain
precise DNA stability data may be provided by a DNA
construct with two AT-rich zones between which a shorter
GC-rich bridge is located. The first passage problem
corresponding to bubble merging at temperatures between the
melting temperatures of the AT and GC zones was recently
calculated (279), and provides the framework for modified
fluorescence correlation setups similar to the one from
reference (249).



Due to intrachain coupling (e.g., Rouse), larger bubbles may
involve an additional 'hook factor' $m^{-\mu}$ (276).







FIG. 32 Time-dependence of the autocorrelation function
$A_t(x_T, t)$ for the sequence AT9 measured in the FCS setup
of reference (249) at 100 mM NaCl. The full lines show the
result from the master equation, based on the DNA stability
parameters from Krueger et al. (247). The inset shows the
broadening of the relaxation time spectrum with increasing
temperature.

D. Probabilistic modelling — the master equation (ME) 

DNA breathing is described by the probability
distribution $P(x_L, m, t)$ to find a bubble of size m
located at $x_L$, whose time evolution follows the ME
(269, 270, 276, 277, 278)

$$\frac{\partial}{\partial t} P(x_L, m, t) = \mathbb{W} P(x_L, m, t). \qquad (45)$$

The transfer matrix $\mathbb{W}$ incorporates the rates t.
Detailed balance guarantees equilibration toward
$\lim_{t \to \infty} P(x_L, m, t) = \mathscr{Z}(x_L, m)/\mathscr{Z}$,
with $\mathscr{Z} = \sum_{x_L, m} \mathscr{Z}(x_L, m)$ (270).
The ME and the explicit construction of $\mathbb{W}$ are
discussed at length in references (270, 276). Eigenmode
analysis and matrix diagonalisation produce all quantities
of interest, such as the ensemble averaged autocorrelation
function

$$A(x_T, t) = \langle I(t) I(0) \rangle - (\langle I \rangle)^2. \qquad (46)$$

$\langle I(t) I(0) \rangle$ is proportional to the survival
density that the bp is open at t and that it was open
initially (269, 270).
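
To make the construction concrete, the following Python sketch builds the transfer matrix for a short homopolymer and evaluates the autocorrelation (46). It is a minimal illustration under stated assumptions, not the implementation of the cited references: the weights `u_st`, `u_hb` and the factor `xi_p` are placeholder numbers, the equilibrium weight of a bubble is assumed to be $\xi' (1+m)^{-c} u_{\rm hb}^m u_{\rm st}^{m+1}$, and instead of diagonalising $\mathbb{W}$ the ME is propagated with a fourth order Runge-Kutta step, starting from the equilibrium distribution restricted to states with an open tag.

```python
C = 1.76  # loop exponent c


def s(m, c=C):
    return ((1 + m) / (2 + m)) ** c


def build_master_equation(M, u_st, u_hb, xi_p, k=1.0):
    """Return (states, W, p_eq); W[j][i] is the rate from state i to state j."""
    states = [(0, 0)] + [(x_l, m) for m in range(1, M + 1)
                         for x_l in range(0, M - m + 1)]
    index = {st: i for i, st in enumerate(states)}
    W = [[0.0] * len(states) for _ in states]

    def add(src, dst, rate):
        i, j = index[src], index[dst]
        W[j][i] += rate   # gain term of the destination state
        W[i][i] -= rate   # loss term of the source state

    u = u_st * u_hb
    for (x_l, m) in states[1:]:
        if m == 1:
            add((x_l, 1), (0, 0), k)                            # annihilation (43)
            add((0, 0), (x_l, 1), k * xi_p * s(0) * u * u_st)   # initiation (43)
        else:
            add((x_l, m), (x_l, m - 1), k / 2)                  # right fork zips (41)
            add((x_l, m), (x_l + 1, m - 1), k / 2)              # left fork zips (41)
        if x_l + m + 1 <= M:
            add((x_l, m), (x_l, m + 1), k * u * s(m) / 2)       # right fork unzips (42)
        if x_l >= 1:
            add((x_l, m), (x_l - 1, m + 1), k * u * s(m) / 2)   # left fork unzips (42)
    # ASSUMED equilibrium weights; the zero-bubble ground state has weight 1
    z = [1.0] + [xi_p / (1 + m) ** C * u_hb ** m * u_st ** (m + 1)
                 for (_, m) in states[1:]]
    total = sum(z)
    return states, W, [w / total for w in z]


def matvec(W, p):
    return [sum(w * x for w, x in zip(row, p)) for row in W]


def rk4_step(W, p, dt):
    k1 = matvec(W, p)
    k2 = matvec(W, [x + 0.5 * dt * a for x, a in zip(p, k1)])
    k3 = matvec(W, [x + 0.5 * dt * a for x, a in zip(p, k2)])
    k4 = matvec(W, [x + dt * a for x, a in zip(p, k3)])
    return [x + dt / 6 * (a + 2 * b + 2 * c + d)
            for x, a, b, c, d in zip(p, k1, k2, k3, k4)]


def autocorrelation(M, x_t, delta, n_steps, dt, u_st, u_hb, xi_p, k=1.0):
    """A(x_T, t) of equation (46) on the time grid t = 0, dt, ..., n_steps * dt."""
    states, W, p_eq = build_master_equation(M, u_st, u_hb, xi_p, k)
    # I = 1 if all bps in [x_t - delta, x_t + delta] are open (open bps: x_l+1..x_l+m)
    ind = [1.0 if m > 0 and x_l + 1 <= x_t - delta and x_l + m >= x_t + delta
           else 0.0 for (x_l, m) in states]
    mean = sum(i * p for i, p in zip(ind, p_eq))
    p = [i * q for i, q in zip(ind, p_eq)]
    out = []
    for _ in range(n_steps + 1):
        out.append(sum(i * x for i, x in zip(ind, p)) - mean ** 2)
        p = rk4_step(W, p, dt)
    return out


A = autocorrelation(M=8, x_t=4, delta=1, n_steps=200, dt=0.05,
                    u_st=0.9, u_hb=0.7, xi_p=1e-2)
print(A[0], A[-1])
```

Time is measured in units of 1/k, matching the abscissa of figure 32. For longer sequences a direct diagonalisation, as described in the text, is the more economical route; the explicit time stepping is chosen here only to keep the sketch free of external libraries.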

In figure 32 the blue curve shows the predicted
behaviour of $A(x_T, t)$, calculated for T = 49°C with the
parameters from (247). As in the experiment we assumed
that fluorophore and quencher attach to bps $x_T$ and
$x_T + 1$, both of which are required to be open to produce a
fluorescence signal. From the scaling plot, we calibrate
the zipping rate as $k = 7.1 \times 10^{4}$/s, in good agreement
with the findings from reference (249). The calculated
behaviour reproduces the data within the error bars,
while the model prediction at T = 35°C shows more
pronounced deviation. Potential causes are destabilising
effects of the fluorophore and quencher, and additional
modes that broaden the decay of the autocorrelation.
The latter is underlined by the fact that for lower
temperatures the relaxation time distribution $f(\tau)$,
defined by $A(x_T, t) = \int \exp(-t/\tau) f(\tau)\, d\tau$,
becomes narrower (figure 32 inset). Deviations may also be
associated with the correction for diffusional motion of
the DNA construct, measured without quencher and neglecting
contributions from internal dynamics (281). Indeed, the
black curve shown in figure 32 was obtained by a 3%
reduction of the diffusion time; see details in (270).

A remark on a prominent alternative approach to DNA
breathing appears in order. This is the Peyrard-Bishop-
Dauxois (PBD) model (jUl; [Hi) based on the set of
Langevin equations (284)

$$m \frac{d^2 y_n}{dt^2} = -\frac{dV(y_n)}{dy_n} - \frac{dW(y_{n+1}, y_n)}{dy_n} - \frac{dW(y_n, y_{n-1})}{dy_n} - m\gamma \frac{dy_n}{dt} + \xi_n(t). \qquad (47)$$

Here, $V(y_n) = D_n [\exp(-a_n y_n) - 1]^2$ is a Morse
potential for the hydrogen bonding, $D_n$ and $a_n$ assuming
two different values for AT and GC bps;
$W(y, y') = \frac{k}{2} [1 + \rho \exp\{-\beta (y + y')\}] (y - y')^2$
is a nonlinear potential to include bp-bp stacking
interactions between adjacent bps y and y'. The parameters
k, $\rho$, $\beta$, $\gamma$, and m are invariant of the
sequence. Usually, the stochastic equations (47) are
integrated numerically (284). Due to its formulation in
terms of a set of Langevin equations, the PBD model is very
appealing, and it is a useful model to study some generic
features of DNA denaturation. The disadvantage of the
current formulation of the PBD model is the fact that it
does not include enough parameters to account for the known
independent stability constants of double stranded DNA (in
fact, only two parameters are allowed to vary with the
sequence, in contrast to the 12 independent parameters
needed to fully describe the bp stacking and hydrogen
bonding (247)). Moreover, there appear to be certain
ambiguities in the proper formulation of boundary
conditions in the stochastic integration (286), and also
with respect to the interpretation of the biological
relevance and computational limitations of the PBD model
(287). The master equation and Gillespie approach brought
forth in references (HH; [273 [Izi; [277l:l278h bridges the
gap between the thermodynamic data for the bp stacking and
hydrogen bonding obtained by various experimental methods,
and the dynamical nature of DNA breathing. It is hoped
that both dynamic models will be developed further in
synergy and eventually lead to a better understanding of
DNA denaturation fluctuations.



For diffusion time $\tau_D = 150\,\mu$s measured for an RNA
construct of comparable length in (281).



E. Stochastic modelling — the Gillespie algorithm 

Despite its mathematically simple form, the master
equation (45) needs to be solved numerically by inverting
the transfer matrix (270, 276). Moreover, it produces
ensemble-averaged information. Given the access to single
molecule data, it is of relevance to obtain a model for the
fully stochastic time evolution of a single DNA bubble,
providing a description for pre-noise-averaged quantities
such as the step-wise (un)zipping. With this scope, we
introduced a stochastic simulation scheme for the
(un)zipping dynamics, using the Gillespie algorithm to
update the state of the system by determining (i) the
random time between individual (un)zipping events, and
(ii) which reaction direction (zipping, $\leftarrow$, or
unzipping, $\rightarrow$) will occur (288). This scheme is
computationally efficient, easy to implement, and amenable
to generalisation.

To define the model, we denote a bubble state of m
broken bps by the occupation numbers $b_m = 1$ and
$b_{m'} = 0$ ($m' \ne m$). The stochastic simulation then
corresponds to the nearest-neighbour jump process

$$b_0 \rightleftharpoons b_1 \rightleftharpoons \ldots \rightleftharpoons b_m \rightleftharpoons \ldots \rightleftharpoons b_{M-1} \rightleftharpoons b_M, \qquad (48)$$

with reflecting boundary conditions at $b_0$ and $b_M$. Each
jump away from state $b_m$ occurs after a random time
$\tau$, and in random direction to either $b_{m-1}$ or
$b_{m+1}$, governed by the reaction probability density
function (289, 290)

$$P(\tau, \mu) = t^{\mu}(m)\, e^{-(t^+(m) + t^-(m)) \tau}, \qquad (49)$$

where $\mu \in \{+, -\}$ denotes the unzipping (+) or zipping
(-) of a bp, and the jump rates $t^{\pm}(m)$ are defined
below. From the joint probability density function (49),
the waiting time probability density function that a jump
away from $b_m$ occurs is given by
$\psi(\tau) = \sum_{\mu} P(\tau, \mu)$, i.e., it is
Poissonian. The probability that the bubble size does not
change in the time interval [0, t] is given by
$\phi(t) = 1 - \int_0^t \psi(\tau)\, d\tau$. The fork
position $x_L$ (and thereby the sequence of bps) is
straightforwardly incorporated (269, 270).
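
The update rule encoded in equation (49) can be sketched in a few lines of Python. The example below is a minimal illustration and not the implementation of reference (288): it tracks only the bubble size m of a homopolymer, and the rates are assumptions chosen for the sketch, namely $t^-(m) = k$ for closing one bp at either fork, $t^+(m) = k u s(m)$ for opening one bp, and $t^+(0) = k \sigma_0 u s(0)$ for bubble initiation, with placeholder values for `u` and `sigma0`. The waiting time is drawn from an exponential distribution with the total rate $t^+ + t^-$, and the direction is chosen with probability $t^{\pm}/(t^+ + t^-)$.

```python
import random

C = 1.76  # loop exponent c


def s(m, c=C):
    return ((1 + m) / (2 + m)) ** c


def rates(m, M, u, sigma0, k=1.0):
    """ASSUMED homopolymer jump rates (t_plus, t_minus) out of bubble size m."""
    if m == 0:
        return k * sigma0 * u * s(0), 0.0      # reflecting boundary at b_0
    t_plus = k * u * s(m) if m < M else 0.0    # reflecting boundary at b_M
    return t_plus, k


def gillespie(M, u, sigma0, t_max, rng, k=1.0):
    """Single stochastic realisation m(t), started from the fully zipped state."""
    t, m = 0.0, 0
    times, sizes = [t], [m]
    while True:
        t_plus, t_minus = rates(m, M, u, sigma0, k)
        total = t_plus + t_minus
        t += rng.expovariate(total)            # (i) random waiting time tau
        if t > t_max:
            break
        # (ii) random direction: unzip with probability t_plus / total
        m += 1 if rng.random() * total < t_plus else -1
        times.append(t)
        sizes.append(m)
    return times, sizes


def closed_fraction(times, sizes, t_max):
    """Fraction of the observation time spent in the zero-bubble state."""
    closed = 0.0
    for t0, t1, m in zip(times, times[1:] + [t_max], sizes):
        if m == 0:
            closed += t1 - t0
    return closed / t_max


times, sizes = gillespie(M=20, u=0.6, sigma0=1e-3, t_max=1e6,
                         rng=random.Random(7))
print(len(times), max(sizes), closed_fraction(times, sizes, 1e6))
```

Because the initiation factor is small, the trajectory spends almost all of its time in the closed state, interrupted by short bursts of (un)zipping events. Extending the state to the pair $(x_L, m)$ with the sequence-dependent rates of equations (41) to (43) only changes the list of possible jumps, not the update rule.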

We start the simulations from the completely zipped
state, $b_0 = 1$ at t = 0, and measure the bubble size at
time t in terms of $m(t) = \sum_{m=0}^{M} m\, b_m(t)$. The
time series of m(t) for a single stochastic realization is
shown in figure 33. It is evident that the bubble events are
very sharp (note the time windows of the zoom-ins), and
most of the time the zero-bubble state $b_0$ prevails due to
$\sigma_0 \ll 1$. Moreover, raising the temperature increases
the bubble size and lifetime, as it should. By construction of


The original expression for the reaction probability density
function,
$P(\tau, \mu) = b_m t^{\mu}(m) \exp\left(-\tau \sum_{m, \mu} b_m t^{\mu}(m)\right)$,
relevant for consideration of multi-bubble states, simplifies
here due to the particular choice of the $b_m$.

the simulation procedure, it is guaranteed that an
occupation number $b_m = 1$ ($m \ne 0$) corresponds to
exactly one bubble.

In a careful analysis, it was shown that the stochastic
simulation method provides accurate information on the
statistical quantities of the bubble, such as opening
probability and autocorrelation function (288). It can
therefore be used to obtain the same information as the
master equation with the advantage of also giving access
to the noise in the system. With the Gillespie technique,
we also obtained the data points in the graphs in this
section.

F. Bacteriophage T7 promoter sequence analysis. 

An example from the analysis of the promoter sequence

1 20
| |
5'-aTGACCAGTTGAAGGACTGGAAGTAATACGACTC
AGTATAGGGACAATGCTTAAGGTCGCTCTCTAGGAg-3'
| | |
38 41 68

of bacteriophage T7 is shown in figure 34 (269). It depicts
the time series of $I(t)$ for the tag positions $x_T = 38$ at
the beginning of TATA, and $x_T = 41$ at the first GC bp
after TATA. It is evident how frequent bubble events are in
TATA in comparison to the vicinal GC-rich domain (note
that AT/TA bps are particularly weak (247)). This is
quantified by the waiting time density $\psi(\tau)$, whose
characteristic time scale is more than an order of
magnitude longer for the $x_T = 41$ position. In contrast, we
observe almost identical behaviour for the bubble survival
density $\phi(\tau)$. Due to the proximity of $x_T = 41$ to
TATA, the typical bubble sizes for both tag positions are
similar, and therefore so is the relaxation time. However,
as shown in figure 34 bottom, the variation of the mean
lifetime obtained from the master equation is quite small
(within a factor 2) for the entire sequence. The latter
graph also shows the insignificant variation according to
the earlier stability parameters by Blake et al. (235).

The results summarised in figure 34 and further studies
in (269, 270) may indicate that it is not solely the
increased opening probability at the TATA motif, as studied
in (291), that is relevant for the initiation of
transcription. Given the rather short bubble opening times
of order of a few $k^{-1}$, it might be sufficient to induce
binding of transcription enzymes (or other single stranded
DNA binding proteins) if only bubble events are repeated
often enough. In the present example, the waiting time
between individual bubble events is reduced by a factor
of 25 inside the TATA motif. Guided by such results,
detailed future studies combining optical tweezers
overstretching and monitoring transcription initiation may
be a step toward better understanding of this important
biochemical process.

FIG. 34 Time series $I(t)$ for the T7 promoter, with
$x_T = 38$, 41. Middle: Waiting time ($\psi(\tau)$) and bubble
survival time ($\phi(\tau)$) densities. Bottom: Mean bubble
survival time, $\Delta = 2$.

We note that the influence of noise (e.g., due to
repetition of single molecule experiments) on the bubble
dynamics can also be studied in the weak noise limit
by a WKB method (292). This model provides information, for
instance, about the time it takes a DNA to denature at
temperatures above $T_m$ (mathematically corresponding to a
finite time singularity). Bubble breathing can be mapped
on the Coulomb problem of the Schrödinger equation, and the
corresponding phase transition studied (293).

G. Interaction of DNA bubbles with selectively
single-strand binding proteins

Let us now come back to the destabilising effect of
single-stranded DNA binding proteins (SSBs) mentioned
in section IV.A. Within a homopolymer description, this
was studied using a master equation approach. The quantity
of interest is the joint probability $P(m, n, t)$ to have a
bubble consisting of m broken bps, and n SSBs bound to the
two arches of the bubble. In addition to the rates
$t^{\pm}$ for bubble increase and decrease, the rates for
SSB binding and unbinding are necessary to define the
breathing dynamics in the presence of SSBs. On the
statistical level, the effect of the SSBs becomes coupled
to the motion of the zipper forks. Thus, the rate for
bubble size decrease is proportional to the probability
that no SSB is located right next to the corresponding
zipper fork; and the rate for SSB binding is proportional
to the probability that there is sufficient unoccupied
space on the bubble. Binding is allowed to be asymmetric,
and is related to a parking lot problem in the following
sense. The number $\lambda$ of bps occupied by a bound SSB
is usually (considerably) larger than one. In order to be
able to bind in between two already bound SSBs, the
distance between these two SSBs must be larger than
$\lambda$. The larger $\lambda$, the less efficient the
SSB-binding becomes, similar to parking large cars on a
parking lot designed for small vehicles. Apart from the
binding size $\lambda$ of the SSBs, two additional physical
parameters come into play: the unbinding rate q of the
SSBs, and their binding strength $\kappa = c_0 K^{\rm eq}$,
consisting of the volume concentration $c_0$ of SSBs and the
equilibrium binding constant
$K^{\rm eq} = v_0 \exp(\beta |E_{\rm SSB}|)$, with the typical
SSB volume $v_0$ and binding energy $E_{\rm SSB}$.

The coupled dynamics of SSB-binding and bubble
breathing is discussed in references (276; 273); similar
effects in end-denaturing DNA were studied in (294) in
detail. Here, we report the behaviour of the effective free
energy landscape in the limit of fast SSB-binding in the
sense that the dimensionless ratio $\gamma = q/k$ of the
SSB-unbinding and bubble zipping rates is large,
$\gamma \gg 1$. This limit allows one to average out the
SSB-dynamics and to calculate an effective free energy, in
which the bubble dynamics with the slow variable m runs
off. The result for two different binding strengths
$\kappa$ is shown in figure 35



along with the free energies corresponding to keeping n
fixed. It is evident that while for lower $\kappa$ the
presence of SSBs diminishes the slope of the effective free
energy, for larger $\kappa$ the slope actually becomes
negative. In the first case, bubble opening is more likely
than without SSBs, but still globally unfavourable. In the
latter case, the presence of SSBs indeed leads to full
denaturation. One observes distinct finite size effects due
to $\lambda > 1$: only when the bubble reaches a minimal
size $m \ge \lambda$ may SSB-binding occur, a second SSB is
allowed to bind to the same arch only once
$m \ge 2\lambda$, etc. This effect also produces the
nucleation barrier for full denaturation in the lower plot
of figure 35. Similar finite size effects were investigated
for biopolymer translocation in references (295, 296). We
note that the transition to denaturation could also be
achieved by reaching a smaller positive slope of the
effective free energy in the presence of SSBs, combined
with additional titration or a change of the effective
temperature through actual temperature change or mechanical
stretching as performed in the experiments reported in
references (259, 260, 261).

VI. ROLE OF DNA CONFORMATIONS IN GENE 
REGULATION 

Our current understanding of gene regulation is to a large
extent based on the experiments by André Lwoff at
Institut Pasteur more than 50 years ago (297). Lwoff
and his collaborators discovered that while a strain of
E. coli, a common intestinal bacterium, divided regularly
when undisturbed, an unexpected phenomenon occurred
when the strain was exposed to UV light: the bacteria
stopped growing and after some 90 minutes they burst
(lysed), releasing a load of viruses. These viruses then
invade new E. coli. Some of the newly infected bacteria
immediately lyse again, while the rest divide normally
while carrying the virus in them. This dormant state
(lysogeny) of the bacterium can then be driven toward
lysis by renewed UV exposure.

The UV exposure-induced transition from lysogeny to
lysis occurs as sketched in figure 36. On infection, the
bacteriophage $\lambda$ injects its DNA into E. coli. In the
lysogenic pathway, the viral DNA is integrated into the host
DNA. During lysogeny, repressor dimers bind to certain
operator sites on the $\lambda$ part of the DNA, recruiting
RNA polymerase to bind to the overlapping promoter
region(s) and blocking the vicinal promoter for the
divergent cro gene. RNA polymerase then transcribes the cI
gene to the left of the operator, leading to the expression
of new repressor molecules. UV light, however, leads to
cleavage of the repressor dimers. Now the basal
transcription of the gene cro, opposite to the cI gene with
respect to the operator region, leads to the expression of
the Cro


Often these viruses are called phages or bacteriophages,
i.e., bacteria eaters.

By activation of RecA proteins.

FIG. 36 Gene regulation, here the example of the (divergent)
bacteriophage $\lambda$ switch after infection of E. coli.
Figure from (fl^, with permission from M. Ptashne. This
figure was modified by the author from the corresponding
figure in M. Ptashne, A Genetic Switch: Phage Lambda and
Higher Organisms, 2nd edition ©Blackwell Science, Malden,
MA and Cell Press, Cambridge, MA, with permission.

protein. Cro bound to the operator then recruits RNA
polymerase to the operator, stabilising the Cro production
and blocking cI. Simultaneously, a whole sequence of
genes is being expressed, and the virus reproduces itself
inside E. coli until lysis occurs. UV light thus flips the
switch from transcription of the gene cI, which maintains
the dormant lysogenic pathway, to lysis, which is fostered
by transcription of the cro gene (16, 298).

The activity of a gene can be monitored even on the 
single genome level, by combining the targeted gene gl 
with the gene leading to synthesis of GFP, the green flu- 
orescent protein, i.e., when gl is transcribed, then so is 
the gene for GFP. Occurrence of fluorescence then reports 
transcription of gl. Connected to questions such as the 
stability of a genetic pathway is the search process of a 
specific gene by regulatory proteins, that is, how dynam- 
ically the binding protein actually locates the operator 
on the genome. We address these points in what follows. 



A. Physiological background of gene regulation and 
expression 

The $\lambda$ switch from figure 36 is an example of a
relatively simple mechanism. Even simpler is the
well-studied Lac repressor. There, the lacZ gene is
expressed by recruitment through the CAP protein when
E. coli is starved of glucose and exposed to lactose. This
enables E. coli to digest lactose. In absence of lactose,
lacZ is blocked by the rep protein. In general, the
expression of a certain gene is just one element in a
cascade of simultaneous and/or hierarchical control units,
such as in the developmental regulatory network of the sea
urchin embryo (299). The basic physiological background is
common to all of them:

Genes are the blueprints of proteins. They control 
physiological processes but also developmental pathways: 
from a fertilised egg cell, eventually all cell types (skin, 
hair, liver, brain cells, etc.) of a human body emerge, or
a skin cell changes colour on sunlight exposure. A gene 
is but a stretch of a DNA molecule, typically comprising 
some 200-1000 bps. Roughly speaking, a gene is on when 
it is being transcribed by RNA polymerase, otherwise it 
is off. RNA polymerase binds at the promoter region 
consisting of some 60 bps close to the beginning of the 
gene. It then converts the A, T, G, C code of the gene 
into a complementary messenger RNA (from which, in 
turn, the protein is produced during translation). The 
stop of transcription is triggered by a certain sequence at 
the end of the gene. Depending on specific conditions of 
the recruitment by regulatory proteins, RNA polymerase 
binding to the promoter of a certain gene is either blocked 
(the gene is off), facilitated (high binding affinity of RNA 
polymerase due to the (simultaneous) presence of certain 
protein(s)), or basal (in absence of any bound regulatory 
protein, RNA polymerase can still have a minor affinity 
to the promoter and then autonomously start
transcription). The transcription mechanism is part and
parcel of the central dogma of molecular biology summarised
in figure 1.

Molecular switches such as the $\lambda$ switch are
surprisingly stable against noise, despite the fact that
there are only about 100 repressor dimers in the entire
bacterial cell (300). Thus, apart from external induction,
lysis occurs by spontaneous induction due to absence of CI
from the
operators (301, 302). Such noise-induced errors are
estimated to occur once in $10^{7}$ cell generations (303, 304).
The stability of the $\lambda$ switch against noise was
analysed in terms of a Wentzell-Freidlin approach (305) and
by a simulation analysis (306). The latter confirmed that
the currently known molecular mechanisms used in modelling
the $\lambda$ switch appear sufficient. While the classical
Shea-Ackers model based on a statistical mechanical
approach (307) is well established and studied numerically
(308), it relies on the knowledge of 13 fundamental Gibbs
free binding energies combined into 40 different binding
states of regulatory proteins and RNA polymerase at the two
promoters of $\lambda$. Simulation of the complete
$\lambda$ regulatory system confirmed the understanding of
the mechanisms of the switch (308). Two more recent studies
show that the $\lambda$ switch remains stable even when each
of the fundamental Gibbs free energies is varied within its
(appreciable) experimental error. Moreover, effects of
potential mutations resulting in more significant changes
of the binding energies were studied, and it was shown that
certain mutations can even be compensated by parallel
mutations influencing other binding energies (suppressors)
(309, 310). A typical result is shown in figure 37.

FIG. 37 Activity of the two $\lambda$ promoters as function
of repressor concentration for vanishing Cro concentration.
The full line corresponds to wild type data, whereas the
dashed lines correspond to "mutations". The thin vertical
line corresponds to lysogenic CI concentration. See
reference (309) for details.

B. Binding proteins: Specific and nonspecific binding 
modes 

Given their very specific function, DNA-binding proteins
must recognise a specific (cognate) sequence of
nucleotides along the genome. In fact, without opening the
double helix, the outside of the DNA can be read by
proteins, as the edge of each bp is exposed at the surface.
These patterns are unique only in the major groove of the
DNA, this being the reason why gene regulatory proteins
generally bind to the major groove. Apart from single
bp pattern recognition, the protein binding is sensitive
to the special surface features of a certain DNA region.
This local structure of DNA needs to be complementary
to the protein structure. Typical structure patterns
(motifs) include helix-turn-helix, zinc fingers, leucine
zipper, and helix-loop-helix motifs (1). In bacteria,
typical DNA-binding proteins cover some 20 bps or less. For
instance, the lac repressor has a cognate sequence of 21
base pairs, the CAP protein 16, and the $\lambda$ repressor
cI 17 bps (1). Although the interaction with a single
nucleotide within such a DNA-protein bond is relatively
weak, the sum of all matching nucleotides reaches
appreciable values for the overall binding enthalpy, see
below. Moreover, regulatory proteins bound simultaneously
can significantly enhance the stability of their individual
bonds.

A simple model for the binding interaction goes back
to the work of Berg and von Hippel (311, 313). Accordingly,
the binding free energy is comprised of two contributions:
(i) the (average) non-specific binding free energy due to
electrostatic interaction with the DNA; and (ii) additional
binding free energy if the sequence of the binding site is
sufficiently close to the best (perfectly matching)
sequence. The transition to the non-specific binding is
supposed to occur via a conformational change of the
regulatory protein from one that allows more hydrogen
bond-formation to another that permits closer contact
between the positive charges of the binding protein and the
negatively charged DNA backbone (312). This is supported by
more recent structural studies: While in the non-specific
binding mode the Lac repressor is bound to DNA in a rather
loose and fuzzy way (314), it appears much more ordered in
the specific mode. In fact, in the latter the protein
induces a bend in the DNA (315). In case (ii), the
additional binding free energy can, to good approximation,
be considered independent and additive. Reference (316)
provides a review of these issues, and derives the
following result. To satisfy both the thermodynamic and
kinetic constraints of the DNA-binding protein interaction,
each additional base mismatch in comparison to the best
sequence amounts to the loss of roughly $2 k_B T$, and the
optimal value for the transition between best specific
binding to the cognate site and non-specific binding is
shown to be some $16 k_B T$ weaker than the best binding.
This value is quite close to the $\approx 14 k_B T$ found
for the difference between specific and nonspecific binding
in (317, 318).
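
The additive picture of case (ii) lends itself to a small numerical illustration. The Python sketch below is only a toy version of such a two-mode model: it assumes a loss of 2 $k_B T$ per mismatch and a non-specific mode lying 16 $k_B T$ above the best site, as quoted above, and lets the protein adopt whichever mode is more favourable. The consensus string and the helper `mutate` are hypothetical and serve only to generate test sites.

```python
CONSENSUS = "ACGTACGTACGTACGTA"   # hypothetical 17 bp cognate sequence


def binding_energy(site, consensus, eps_mismatch=2.0, e_nonspecific=16.0):
    """Binding free energy in units of k_B T, relative to the best site (0).

    Each mismatch costs eps_mismatch; once the specific mode becomes weaker
    than the sequence-independent non-specific mode, the latter takes over.
    """
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    mismatches = sum(a != b for a, b in zip(site, consensus))
    return min(eps_mismatch * mismatches, e_nonspecific)


def mutate(sequence, n):
    """Hypothetical helper: replace the first n bases by a different base."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    return "".join(swap[b] for b in sequence[:n]) + sequence[n:]


for n in (0, 3, 8, 10):
    print(n, binding_energy(mutate(CONSENSUS, n), CONSENSUS))
```

With these numbers a site loses its sequence specificity once it carries about eight mismatches, since 8 times 2 $k_B T$ equals the assumed gap of 16 $k_B T$ to the non-specific mode.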

The fact that regulatory proteins bind with varying 
affinity is an important ingredient in gene regulation: 
Not all promoters should have the same activity, because 
some proteins are required by the cell at much higher 
levels than others. Thus, one given regulatory protein, 
that controls the recruitment to the promoters of several 
genes, can act with different strength depending on the 
degree of matching with the local sequence. 

Non-specific binding can become quite appreciable. It
was discovered in reference (319) that in the case of
the Lac repressor less than 10% of the proteins were
unbound. In a more recent study using in vivo data
of the $\lambda$ switch, it was found that in a lysogen
nearly 90% of the repressor protein cI is non-specifically
bound. This implies that only 10-20 free cI dimers exist in
the E. coli cell at any time, pointing at the important role
of non-specific binding in the search process of the
cognate site addressed in the following subsection. Under
different conditions, both cI and Cro are always
non-specifically bound by more than 50%. The corresponding
non-specific binding energies were estimated as $7 k_B T$
(317, 318).

We note that in contrast to regulatory proteins, restric- 
tion enzymes have an approximate all-or-nothing match- 
ing condition: If a defined sequence matches the restric- 
tion enzyme, it will cut, otherwise not. Even a single 
mismatch reduces the action of the restriction enzyme 
by orders of magnitude. This distinction from regulatory
proteins makes sense, as restriction enzymes are survival
mechanisms and should not just cut the cell's own DNA
(320). This does not mean that restriction enzymes do
not bind non-specifically; in fact, this is an important
ingredient of their search process in total analogy to regulatory
proteins. However, their sole active role occurs
on complete matching.

FIG. 38 Schematic of the search mechanisms in equation (52).

C. The search process for the specific target sequence 

To find their specific (cognate) binding site along the 
genome, DNA-binding proteins such as restriction en- 
zymes or transcription factors have to search megabases 
along the DNA molecule. The high accuracy of gene
expression control by binding proteins such as in the λ
switch requires a fast search and recognition of the target
sequence by the proteins. A simple 3-dimensional (3D)
search of the target sequence by the proteins is not sufficient
to explain experimentally measured target search
rates. It was suggested relatively early (321; 323)
that additional search mechanisms such as 1D sliding
along the genome are needed to account for the actual
efficiency of the search process. In their pioneering work,
Berg, von Hippel and coworkers established a statistical
model for target search comprising four fundamental
steps, as shown in figure 38: (i) 3D macrohops, during
which the protein fully detaches from the genome until,
after a volume excursion, it rebinds to the DNA (as
a good approximation, the landing site on the DNA after
a macrohop can be assumed to be equidistributed
and uncorrelated); (ii) microhops, during which the protein
detaches from the DNA but always stays very close
to it (i.e., the microhop takes place within a cylinder
whose radius corresponds to the escape distance of the
protein from the DNA, see (311)); (iii) 1D sliding along
the genome (while preserving a certain bonding to the
DNA due to nonspecific binding); and (iv) intersegmental
jumps. The latter are mediated by DNA-loops bringing
two chemically remote segments of the DNA close
in Euclidean space, see, for instance, (20) and references
therein. A protein like Lac repressor, which can establish
bonds to two different stretches of dsDNA simultaneously,
can then jump from one to the neighbouring
segment. This process might lead to a paradoxical diffusion
behaviour (323; 324). However, if the conformational
changes in the DNA are not too slow, both the
bulk mediated macrohops and the intersegmental transfer
lead to fast mixing of the enzymes' positions along
the chain (as was shown for a related problem in
(HH)), and on the mean-field level can be described by a
desorption followed by adsorption at a random place.

Recently, there has been renewed interest in the targeting
problem, both theoretically (see, for instance, (316;
326; 327)) and experimentally (e.g., (329; 330)),
including single molecule studies (260; 261; 331). Despite
the extensive knowledge of specific binding rates
and both specific and non-specific binding free energies,
the precise relative contributions of the different search
mechanisms (and, to some extent, also the stringent criteria
to define these four elementary interactions) are not
fully resolved. Moreover, it has been suggested that under
tight(er) binding conditions, the sliding of the protein
becomes subdiffusive due to the local structure landscape
of a heteropolymer DNA (332). This complication, however,
is expected to be relaxed in a more loosely bound
search mode of the searching protein (326). We here
adopt the latter view of normal diffusion, which is corroborated
by the experimental study in the next subsection.



D. A unique situation: Pure one-dimensional search of 
SSB mutants 

In previous studies, the 1D sliding problem had always
been considered as a problem of 3D diffusion which
is enhanced by 1D diffusion. Thus, workers such as Berg,
Winter, and von Hippel (311) assumed that non-specifically
bound proteins would on average unbind before finding
their specific binding sites. This results in an enhancement
of specific binding rates that is proportional
to the 1D sliding rate, but the overall specific binding
rate depends linearly on protein concentration. These
studies neglect the possibility that the protein finds its
specific site before unbinding. Given the experimental
conditions under which transcription factor binding has
been previously studied, this approximation is appropriate.
However, as demonstrated in (333), the opposite regime,
in which the unbinding rate is much lower than the specific
binding rate, occurs for the 1D search of DNA by the
single-stranded DNA binding protein T4 gene 32 protein
(gp32). This fast 1D search rate is essential for gp32 to be
FIG. 39 Dimensional binding rate $k_a$ in 1/s as function of
protein concentration $C$ in M, for parameters corresponding
to 100 mM salt. The fitted 1D diffusion constant for sliding
along the dsDNA is $D_{1d}$ = 3.3 • lO^^cm'^/sec, located nicely
within the experimental value 10^* . . . 10~^cm^/sec (259).



able to quickly find specific locations on DNA molecules 
that are undergoing replication, and which have large 
sections of single-stranded DNA exposed for gp32 bind- 
ing. The resulting nonlinear concentration dependence of 
gp32 binding will likely have significant effects on gp32's 
ability to find its replication sites as well as its ability to 
recruit other proteins during replication. If these non- 
linear effects also occur for TFs, this characteristic will 
strongly affect regulatory processes governed by protein 
binding. 

Results from the single DNA overstretching experiment
are shown in figure 39 along with the results from
the theoretical and simulation analysis of reference
(333). The scaling of the search rate as function of concentration
is described by the relation

$$k_a \sim D_{1d}\, n_0^2 \qquad (51)$$



Possibly, also other binding proteins are able to perform inter- 
segmental jumps. 



obtained for the pure 1D search of random walkers of
line density $n_0$ searching along the DNA. For low concentrations,
the McGhee and von Hippel isotherm (334)
predicts a linear relation between $n_0$ and the volume concentration
$C$; thus, $k_a \propto C^2$. The experimental evidence
for the purely one-dimensional search process, as shown in figure 39
for 100 mM salt, was found for a large range of salt concentrations,
see references (261; 333) for details. The case
of high line density of proteins was discussed in (1331: 
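The quadratic concentration dependence of the pure 1D search can be made explicit with a few lines of code. The sketch below assumes the low-concentration limit in which the line density is strictly proportional to the bulk concentration; the proportionality constant `slope` and the `prefactor` are hypothetical placeholders, not fitted values from the experiments.

```python
def line_density(c_molar: float, slope: float) -> float:
    """Line density n0 of bound proteins, assuming n0 = slope * C (dilute limit)."""
    return slope * c_molar


def search_rate(c_molar: float, d_1d: float, slope: float,
                prefactor: float = 1.0) -> float:
    """Scaling form k_a ~ D_1d * n0**2 of the pure 1D search rate.

    d_1d      : 1D diffusion constant along the DNA
    slope     : hypothetical isotherm slope relating C to n0
    prefactor : unknown numerical constant of order one (assumption)
    """
    n0 = line_density(c_molar, slope)
    return prefactor * d_1d * n0 ** 2


if __name__ == "__main__":
    for c in (1e-9, 2e-9, 4e-9):
        print(c, search_rate(c, d_1d=1.0, slope=1.0e9))
```

Doubling the concentration quadruples the rate, in contrast to the linear dependence expected when 3D excursions dominate.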



E. Lévy flights and target search

We now address the general search process with interchange
of 1D and 3D diffusion, and intersegmental jumps.
To this end, we first quickly review the definition of Lévy
flights HH; [111 [Mi [Mi; [MO) .
Lévy flights (LFs) are random walks whose jump
lengths $x$ are distributed like $\lambda(x) \simeq |x|^{-1-\alpha}$ with exponent
$0 < \alpha < 2$ (150). Their probability density
to be at position $x$ at time $t$ has the characteristic
function $P(q,t) \equiv \int_{-\infty}^{\infty} e^{iqx} P(x,t)\,dx = \exp(-D_L |q|^{\alpha} t)$,
a consequence of the generalised central limit theorem
(342; 343); in that sense, LFs are a natural extension
of normal Gaussian diffusion ($\alpha = 2$). LFs occur in a
wide range of systems (338); in particular, they represent
an optimal search mechanism in contrast to locally
oversampling Gaussian search (341). Dynamically, LFs
can be described by a space-fractional diffusion equation
$\partial P/\partial t = D_L\, \partial^{\alpha} P(x,t)/\partial |x|^{\alpha}$, a convenient basis to
introduce additional terms, as shown below. $D_L$ is a
diffusion constant of dimension cm$^{\alpha}$/sec, and the fractional
derivative is defined via its Fourier transform,
$\mathscr{F}\{\partial^{\alpha} P(x,t)/\partial |x|^{\alpha}\} = -|q|^{\alpha} P(q,t)$ (340).
LFs exhibit superdiffusion in the sense that $\langle |x|^{\zeta} \rangle^{2/\zeta} \simeq
(D_L t)^{2/\alpha}$ ($0 < \zeta < \alpha$) (338), spreading faster than the
linearly growing mean squared displacement of standard
diffusion ($\alpha = 2$). A prime example of an LF is linear
particle diffusion to next neighbour sites on a fast folding
('annealed') polymer that permits intersegmental jumps
at chain contact points (see figure 38) caused by polymer
looping (323; 324). In fact, the contour length $|x|$ stored
in a loop between such contact points is distributed in 3D
like $\lambda(x) \simeq |x|^{-1-\alpha}$, where $\alpha = 1/2$ for Gaussian chains
($\theta$ solvent), and $\alpha \approx 1.2$ for self-avoiding walk chains
(good solvent) (154).
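To make the heavy-tailed jump statistics concrete, the following sketch draws jump lengths from a pure power law with tail exponent $\alpha$ by inverse-transform sampling. It assumes a sharp lower cutoff `x_min` (a Pareto law), which is a modelling convenience of this example; only the large-$|x|$ behaviour $\lambda(x) \simeq |x|^{-1-\alpha}$ is meant to match the text.

```python
import random


def levy_jump(alpha: float, rng: random.Random, x_min: float = 1.0) -> float:
    """Draw one signed jump whose length has the tail P(|x| > L) = (L / x_min)**(-alpha)."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("Levy exponent must satisfy 0 < alpha < 2")
    u = 1.0 - rng.random()                    # uniform on (0, 1], avoids division by zero
    length = x_min * u ** (-1.0 / alpha)      # inverse-transform sampling of a Pareto law
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * length


def tail_fraction(alpha: float, threshold: float,
                  n_samples: int = 100_000, seed: int = 1) -> float:
    """Empirical fraction of jumps longer than `threshold`."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples)
               if abs(levy_jump(alpha, rng)) > threshold)
    return hits / n_samples


if __name__ == "__main__":
    for alpha in (0.5, 1.2):                  # Gaussian chain and self-avoiding chain values
        print(alpha, tail_fraction(alpha, 10.0), 10.0 ** (-alpha))
```

For $\alpha = 1/2$ roughly one jump in three exceeds ten times the cutoff length, which illustrates why intersegmental transfer mixes positions along the chain so efficiently.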

In our description of the target search process, we use
the density per length $n(x,t)$ of proteins on the DNA
as the relevant dynamical quantity ($x$ is the distance
along the DNA contour). Apart from intersegmental
transfer, we include 1D sliding along the DNA with diffusion
constant $D_B$, protein dissociation with rate $k_{\rm off}$
and (re)adsorption with rate $k_{\rm on}$ from a bath of proteins
of concentration $n_{\rm bulk}$. The dynamics of $n(x,t)$ is thus
governed by the equation (344)

$$\frac{\partial}{\partial t} n(x,t) = \left( D_B \frac{\partial^2}{\partial x^2} + D_L \frac{\partial^{\alpha}}{\partial |x|^{\alpha}} - k_{\rm off} \right) n(x,t) + k_{\rm on} n_{\rm bulk} - j(t)\,\delta(x). \qquad (52)$$

Here, $j(t)$ is the flux into the target located at $x = 0$.
We determine the flux $j(t)$ by assuming that the target
is perfectly absorbing: $n(0,t) = 0$ (346). Let the system
initially be at equilibrium, except that the target
is unoccupied; then, the initial protein density is
$n_0 = n(x,0) = k_{\rm on} n_{\rm bulk}/k_{\rm off}$. The total number of
particles that have arrived at the target up to time $t$ is
$J(t) = \int_0^t dt'\, j(t')$. We derive explicit analytic expressions
for $J(t)$ in different limiting regimes, and study the
general case numerically. We use $J(t)$ to obtain the mean
first arrival time $T$ to the target; in particular, to find the
value of $k_{\rm off}$ that minimises $T$.

Note that the dimensions of the on and off rates differ: while
$[k_{\rm off}] = {\rm sec}^{-1}$, we chose $[k_{\rm on}] = {\rm cm}^2/{\rm sec}$.

FIG. 40 Optimal choice of off rate $k_{\rm off}$ as function of the LF
diffusion constant, from numerical evaluation of the model in
reference (344). The circle on the abscissa marks where $k_{\rm off}^{\rm opt}$
becomes 0 in the case $\alpha < 1/2$. Legend: $\alpha = 3/2$, Eq. (14);
$\alpha = 3/4$, Eq. (14); $\alpha = 1/4$, Eq. (14); $\alpha = 3/4$, Eq. (16);
$\alpha = 1/4$, Eq. (17).

The various regimes of target search embodied in equation
(52) are discussed in detail in references (344; 345).
The main result for the efficiency of the related search
process is summarised in figure 40, i.e., which protein
unbinding rate $k_{\rm off}$ optimises the mean search time $T$.
Three regimes can be distinguished:

(i) Without Lévy flights, we obtain $k_{\rm off}^{\rm opt} = k'_{\rm on}$: the
proteins should spend equal amounts of time in bulk and
on the DNA. This corresponds to the result obtained for
single protein searching on a long DNA (326).

(ii) For $\alpha > 1$, i.e., when DNA is in the self-avoiding
regime, we find

$$k_{\rm off}^{\rm opt} \sim (\alpha - 1)\, k'_{\rm on}. \qquad (53)$$

The optimal off rate shrinks linearly with decreasing $\alpha$.

(iii) For $\alpha < 1$, i.e., when DNA leaves the self-avoiding
phase (e.g., by lowering the temperature or introducing
attractive interactions), the value of $k_{\rm off}^{\rm opt}$ approaches zero
as the frequency of intersegmental jumps ($\propto D_L$) increases:
The Lévy flight mechanism becomes so efficient
that bulk excursions become irrelevant. At $\alpha = 1/2$, the
case of the ideal Gaussian chain, we observe a qualitative
change: When $\alpha < 1/2$, the rate $k_{\rm off}^{\rm opt}$ reaches zero for
finite values of the rate for intersegmental jumps. Note
that when $\alpha < 1$, the spread of the Lévy flight ($\sim t^{1/\alpha}$)
grows faster than the number of sites visited ($\sim t$),
rendering the mixing effect of bulk excursions insignificant.
A scaling argument to understand the crossover
at $\alpha = 1/2$ relates the probability density of first arrival
with the width ($\sim t^{1/\alpha}$) of the Green's function of a Lévy
flight, $p_{\rm fa} \sim t^{-1/\alpha}$. We see that the associated mean arrival
time becomes finite for $0 < \alpha < 1/2$, even for the
infinite chain limit considered here.
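The crossover at $\alpha = 1/2$ in the scaling argument can be checked by elementary integration. The sketch below assumes that only the long-time tail $p_{\rm fa} \sim t^{-1/\alpha}$ matters and evaluates the unnormalised contribution $\int_1^{t_{\max}} t\, t^{-1/\alpha}\, dt$ to the mean arrival time; the lower cutoff at $t = 1$ and the omitted prefactor are arbitrary choices made for this illustration.

```python
import math


def truncated_mean_arrival(alpha: float, t_max: float) -> float:
    """Integral of t * t**(-1/alpha) from t = 1 to t_max (tail contribution only)."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    exponent = 2.0 - 1.0 / alpha          # power of t after integration
    if abs(exponent) < 1e-12:             # alpha = 1/2: logarithmic divergence
        return math.log(t_max)
    return (t_max ** exponent - 1.0) / exponent


if __name__ == "__main__":
    for alpha in (0.25, 0.5, 0.75):
        values = [truncated_mean_arrival(alpha, 10.0 ** k) for k in (2, 4, 6)]
        print(alpha, values)
```

For $\alpha = 1/4$ the values saturate as the upper limit grows, whereas for $\alpha \geq 1/2$ they keep increasing, reproducing the statement that the mean arrival time is finite only for $0 < \alpha < 1/2$.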

We remark that this model is valid for an annealed 
DNA only. This means that the chain can equilibrate (at 
least, locally) on the typical time scale between intersegmental
jumps. Even though real DNA in solution might
not be fully annealed, features of this analysis will reflect 
on the target search. A more detailed study of different 
regimes of DNA is under way. 

F. Viruses — extreme nanomechanics 

Viruses have played an important role in the discovery
of the mechanisms underlying gene regulation, see, for instance,
reference (297). From a nanoscience perspective,
viruses are of interest in their own right. During the assembly
of many viruses, the viral DNA of several μm
length is packaged into the capsid, the protein container
making up most of the virus, by a motor protein. This
motor packages the DNA by exerting forces of up to 60
pN or more, causing pressures building up in the capsid
of the order of 6 MPa (347). The size of the capsid
spans a few tens of nm, and is therefore comparable to
the persistence length of DNA iH [MO; ImI [Ml). 
Therefore, fluctuation-based undulations are suppressed, 
and the chain can be approximately thought of as being 
wound up helically like thread on a bobbin, or like a ball 
of yarn. Ultimately, a relatively highly ordered 3D con- 
figuration of the DNA inside the capsid is achieved, which 
under certain conditions may even lead to local crystalli- 
sation of the DNA ^34^; .353.; iM; ^; ■ It 
is generally argued that this ordered arrangement helps 
to avoid the creation of entanglements or even knots of 
the wound-up DNA, thus enabling easy ejection, i.e., re- 
lease of the DNA once the phage docks to a new host cell; 
this ejection is not assisted by the packaging motor, but 
it can be facilitated by host cellular RNA polymerase,
which starts to transcribe the DNA and thereby pulls it
out of the capsid (354). Details on the packaging
energetics can be found in reference (353) and the works
cited therein. Model calculations for the entropy loss,
binding and twist energy, and electrostatic forces that
need to be overcome on packaging reveal that at higher
packaging ratios the packaging force almost exclusively
comes from the electrostatic repulsion.

VII. FUNCTIONAL MOLECULES AND NANOSENSING 

Complex molecules can be endowed with the distinct
feature that they contain subunits which are linked to
each other mechanically rather than chemically (357).
The investigation of the structure and properties of such
interlocked topological molecules is the subject of the growing
field of chemical topology (130). While speculations
about the possibility of catenanes (Olympic rings) date
back to the early 20th century lectures of Willstätter,
the actual synthesis of catenanes and rotaxanes succeeded
in 1958 (357). Modern organic chemistry has seen
the development of refined synthesis methods to generate
topological molecules.

catena (lat.), the chain.

rota (lat.), the wheel; axis (lat.), the axle.

A. Functional molecules 

In parallel to the miniaturisation in electronics (358)
and the possibility of manipulating single (bio)molecules
(359), supramolecular chemistry which makes use of
chemical topology properties is coming of age (360; 361).
Thus, rotaxane-type molecules are believed to be the
building blocks for certain nanoscale machines and motors
(362), so-called hermaphrodite molecules have been
shown to perform linear relative motion ("contraction
and stretching") (363), and pirouetting molecules have
been synthesised (364). Moreover, topological molecules
are thought to become components for molecular electronics
switching devices in memory and computing applications
(365; 366). These molecular machines are usually
of lower molecular weight, and their behaviour is essentially
energy-dominated in the sense that their conformations
and dynamical properties are governed by external
and thermal activation in an energy landscape. The
understanding of the physical properties and the theoretical
modelling of such designer molecules and their natural
biological counterparts has increasingly gained momentum,
and the stage is already set for the next generation
of applications (IsEs!; 13591; ^SM, ^pl [M3; [3651363: [365 
ii6; 367; 368; 369, 370; 371; 37^ 

In reference (230) we introduced some basic concepts
for functional molecules whose driving force is entropic
rather than energetic, see also the more recent publications
in chemistry journals (375; 374). Entropy-functional
molecules will be of higher molecular weight
(hundred monomers or above) in order to provide sufficient
degrees of freedom such that entropic effects can
determine the behaviour of the molecule. The potential
for such entropy-driven functional molecules can be anticipated
from the classical Gibbs free energy

$$\mathscr{F} = U - TS. \qquad (54)$$

In functional molecules, $\mathscr{F}$ is minimised mainly by variation
of the internal energy $U$ representing the shape
of the energy landscape of the functional unit. New
types of molecules were proposed for which $\mathscr{F}$ is minimised
by variations of the entropy $S$, while the energies
and chemical bondings are left unchanged (230). The
entropy-functional units of such molecules can be specifically
controlled by external parameters like temperature,
light flashes, or other electromagnetic fields (360; 361).
We note that DNA is already being studied as a macromolecular
prototype building block for molecular machines
(369).

A typical example is the molecule shown in figure 41.
According to the arrangement of the sliding rings 1 and 2,
this compound exhibits the unique feature of a molecule
that it can slide laterally. Suggested as precursors of
molecular muscles (363), this compound could be propelled
with internal entropy-motors, which entropically
adjust the elongation of the muscle. In the configuration
shown in figure 41, the sliding ring 3 creates, if activated,
an entropic force which tends to contract the "muscle";
at $T = 300$ K and on a typical scale $x \approx 10$ nm, the
entropic force $k_BT/x$ is of the order of pN, and thus
comparable to the force created in biological muscle cells
(375). Molecular muscles of such a make can be viewed
as the nano-counterpart of macroscopic muscle models
proposed by de Gennes (376), in which the contraction
is based on the entropy difference between the isotropic
and nematic phases in liquid crystalline elastomer films
(377).

FIG. 41 Molecular muscle consisting of two interlocked rings
1 and 2 with attached rod-like molecules. Within this structure,
sliding rings, 3, can be placed, which, if activated, tend
to contract the muscle by entropic forces.
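The order of magnitude of this entropic force follows from a one-line estimate. The sketch below simply evaluates $k_BT/x$ for room temperature ($T = 300$ K) and $x = 10$ nm; the function name is an illustrative choice.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def entropic_force_pN(temperature_K: float, length_nm: float) -> float:
    """Entropic force k_B T / x, returned in piconewton."""
    force_newton = K_B * temperature_K / (length_nm * 1e-9)
    return force_newton * 1e12


if __name__ == "__main__":
    print(f"{entropic_force_pN(300.0, 10.0):.3f} pN")
```

The result is about 0.4 pN, i.e., a fraction of a piconewton, consistent with the order-of-magnitude statement above.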

Similarly, one might speculate whether the DNA helix-coil
transition (262) in multiplication setups could be facilitated
in the presence of pre-ring molecules which in
vitro attach to an opened loop of the double strand and 
close, creating an entropy pressure which tends to open 
up the vicinal parts of the DNA which are still in the he- 
lix state. Finally, considering molecular motors, it would 
be interesting to design an externally controllable, purely 
entropy-driven rotating nanomotor. 

Numerous additional nanoapplications of biopolymers 
appear in current literature. An interesting example is 
the nanomotor created by a DNA ring in a periodically 
driven external field, for instance, a focused light beam
inducing localised temperature variations (378; 379).
The speeds possibly attained by such a device are of the
order of those reached by biological organisms. Such a
nanorotor could be used to stir smallest volumes in highly
viscous environments.



B. Nanosensing 

The advances in miniaturisation of reactors and devices
also bring along the need for probes by which
smallest volumes can be tested. For instance, microarrays
used in genomics require sensors to detect the presence
of certain proteins (often at small concentrations)
in a microdish, without disturbing the environment in
the small volume too much. Similarly, single molecule
experiments require specific local detection possibilities.

A fine example for a potential nanosensor is the blinking
behaviour of a fluorophore-quencher pair mounted on
the denaturation wedge as shown in figure 42. This setup,
similar to the ones described in references (249; 281),
works as follows. As long as the dsDNA is intact, fluorophore
and quencher are in close proximity. Once
they come apart from one another when the denaturation
wedge opens up, the incident laser light causes fluorescence
of the dye. The on/off blinking of this "molecular
beacon" can be monitored in the focus of a confocal microscope,
or, depending on the intensity of the emitted
light, by a digital camera. The blinking renders immediate
information about the state of the bp that is tagged
by the dye-quencher pair: fluorescence indicates
that the bp is currently broken. It is therefore advantageous
to define the random variable $I(t)$ with the
property

$$I(t) = \begin{cases} 0 & \text{if base-pair at } x = x_T \text{ is closed} \\ 1 & \text{if base-pair at } x = x_T \text{ is open,} \end{cases} \qquad (55)$$

and in experiments one typically measures the corresponding
blinking autocorrelation function

$$A(t) = \langle I(t) I(0) \rangle - \langle I \rangle_{\rm eq}^2, \qquad (56)$$

where $\langle I \rangle_{\rm eq}$ is the (ensemble) equilibrium value, or its
spectral decomposition

$$A(t) = \int_0^{\infty} f(\tau) \exp\left(-\frac{t}{\tau}\right) d\tau, \qquad (57)$$

where

$$f(\tau) = \sum_{p \neq 0} T_p^2\, \delta(\tau - \tau_p) \qquad (58)$$

is called the relaxation time spectrum.

FIG. 42 Molecular beacon based on local DNA denaturation.
The green blobs may represent single-stranded DNA binding
proteins or, more specifically, proteins or
other molecules binding to a custom designed DNA sequence along the
denaturation fork. Bound proteins stabilize the denatured
fork and change the spectrum of the beacon.

FIG. 43 Spectral response of the denaturation beacon in
the presence of single-stranded DNA binding proteins. Top:
Relaxation time spectrum, bottom: blinking autocorrelation
function. Horizontal axis: $t$ (in units of $k^{-1}$), logarithmic scale.

Figure 43 shows an example for the achievable sensitivity
of such nanobeacons, for the case in which the
denaturation wedge is in solution together with a certain
concentration (proportional to k, compare Sec. IVG)
of selectively single-stranded DNA binding proteins, as
discussed previously. Both measurable
signals, $A(t)$ and $f(\tau)$, change distinctly with varying SSB
concentration.
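For a discrete relaxation time spectrum, the integral over $f(\tau)$ collapses to a sum of exponentials, $A(t) = \sum_{p \neq 0} T_p^2 \exp(-t/\tau_p)$. The following sketch evaluates this sum; the two modes used in the example are hypothetical numbers chosen for illustration and are not taken from the beacon model.

```python
import math
from typing import Iterable, Tuple

Mode = Tuple[float, float]  # (weight T_p**2, relaxation time tau_p)


def autocorrelation(t: float, modes: Iterable[Mode]) -> float:
    """Blinking autocorrelation A(t) for a discrete relaxation time spectrum."""
    return sum(weight * math.exp(-t / tau) for weight, tau in modes)


if __name__ == "__main__":
    # Hypothetical spectrum: one fast and one slow relaxation mode.
    example_modes = [(0.20, 1.0), (0.05, 10.0)]
    for t in (0.0, 1.0, 10.0, 50.0):
        print(t, autocorrelation(t, example_modes))
```

At long times the slowest mode dominates, so a shift of the largest $\tau_p$ caused by bound proteins shows up directly in the tail of $A(t)$.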



VIII. SUMMARY 

Biopolymers such as DNA, RNA, and proteins are 
indispensable for their specificity and robustness in all 
forms of life. Given their detailed physical properties
such as DNA's persistence length of some 50 nm or
its local denaturation in nano-bubbles already at room
temperature, and biochemically relevant interfaces such
as 10-20 bps, they deeply stretch into the nanoscience
domain. This statement is twofold in the following
sense. Firstly, nanotechniques such as atomic force microscopes
become important tools to manipulate and
probe biomolecules and their interaction even on the sin- 
gle molecule level. Secondly, biomolecules are entering 
the stage as nanotools such as nanosensors, functional 
molecules, or highly sensitive force transducers. 

The possibility to perform controlled experiments on 
biomolecules, for instance, to measure the force-extension 
curves of single biopolymers, also opens up novel possi- 
bilities to test new physical theories. The foremost ex- 
amples may be the exploration of persistence lengths and 
other polymer physics properties, and the statistical me- 
chanical concepts relevant for small system sizes. The 
latter are known under the keyword of the Jarzynski relation
connecting the non-equilibrium work performed on
a physical system with the difference in the thermodynamic
(i.e., equilibrium) potential between initial and final
states (380; 381). However, there exist by now several
similar theories addressing different physical quantities,
such as the concept of entropy production along a single
particle trajectory (382).

This review summarises fundamental physical properties
of DNA, and their relevance for both biological processes
and technological applications. The extensive list
of references will be useful for further studies on specific
topics covered herein. We are confident that the role of
biomolecules in technology, not least for biomedical
applications, will experience a dramatic increase during
the coming years and will enable us to extend current
physical understanding of fundamental processes.
APPENDIX: A polymer primer.

In this section, we introduce some basic concepts from
polymer physics. Starting from the random walk model,
we define the fundamental measures of a polymer chain,
before introducing excluded volume. For more details,
we refer to the monographs (191; 383).

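The simplest polymer model is due to Orr (384). It
models the polymer chain as a random walk on a periodic
lattice with lattice spacing $a$. Then, each monomer of
index $i$ is characterised by a position vector $\mathbf{R}_i$ with $i =
0, 1, \ldots, N$. The distance between monomers $i$ and $i + 1$
is called $\mathbf{a}_{i+1} = \mathbf{R}_{i+1} - \mathbf{R}_i$. Consequently, the end-to-end
vector of the polymer is

$$\mathbf{r} = \sum_{i=1}^{N} \mathbf{a}_i. \qquad (59)$$

Different $\mathbf{a}_i$ have completely independent orientations,
such that we immediately obtain the average ($\langle \cdot \rangle$ over
different configurations) squared end-to-end distance

$$R_0^2 \equiv \langle \mathbf{r}^2 \rangle = \sum_{i,j=1}^{N} \langle \mathbf{a}_i \cdot \mathbf{a}_j \rangle = N a^2. \qquad (60)$$

$R_0 \simeq N^{1/2} a$ is a measure for the size of the random walk.
An alternative measure of the size of a polymer chain
is provided by its radius of gyration $R_g$, which may be
measured by light scattering experiments. It is defined
by

$$R_g^2 = \frac{1}{1+N} \sum_{i=0}^{N} \left\langle (\mathbf{R}_i - \mathbf{R}_G)^2 \right\rangle \qquad (61)$$

and measures the average squared distance to the centre
of gravity,

$$\mathbf{R}_G = \frac{1}{1+N} \sum_{i=0}^{N} \mathbf{R}_i. \qquad (62)$$

Expression (61) can be rewritten as

$$R_g^2 = (1+N)^{-2} \sum_{i=0}^{N-1} \sum_{j=i+1}^{N} \left\langle (\mathbf{R}_i - \mathbf{R}_j)^2 \right\rangle. \qquad (63)$$

With $\mathbf{R}_j - \mathbf{R}_i = \sum_{n=i+1}^{j} \mathbf{a}_n$, one can easily show that
$R_g^2 = a^2 N(N + 2)/[6(N + 1)]$. For large $N$, that is,
$R_g^2 \simeq \frac{a^2}{6} N$, and therefore:

$$R_g^2 \simeq \frac{R_0^2}{6}. \qquad (64)$$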
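Both size measures are easy to check by direct simulation. The sketch below generates random walks on a cubic lattice with unit spacing (so $a = 1$) and averages the squared end-to-end distance and the squared radius of gyration; the sample size and the seed are arbitrary choices for this illustration.

```python
import random
from typing import List, Tuple

Point = Tuple[int, ...]


def lattice_walk(n_steps: int, rng: random.Random, dim: int = 3) -> List[Point]:
    """Positions R_0 ... R_N of a random walk on a cubic lattice with spacing a = 1."""
    pos = [0] * dim
    path = [tuple(pos)]
    for _ in range(n_steps):
        axis = rng.randrange(dim)          # pick one of the lattice axes
        pos[axis] += rng.choice((-1, 1))   # step forwards or backwards along it
        path.append(tuple(pos))
    return path


def squared_end_to_end(path: List[Point]) -> float:
    """Squared distance between the first and the last monomer."""
    return float(sum((e - s) ** 2 for s, e in zip(path[0], path[-1])))


def squared_radius_of_gyration(path: List[Point]) -> float:
    """Mean squared distance of the N + 1 monomers from their centre of gravity."""
    n = len(path)
    dim = len(path[0])
    centre = [sum(p[k] for p in path) / n for k in range(dim)]
    return sum(sum((p[k] - centre[k]) ** 2 for k in range(dim)) for p in path) / n


def averages(n_steps: int, n_walks: int = 2000, seed: int = 0) -> Tuple[float, float]:
    """Configuration averages of r^2 and R_g^2 over n_walks independent walks."""
    rng = random.Random(seed)
    r2_sum = rg2_sum = 0.0
    for _ in range(n_walks):
        path = lattice_walk(n_steps, rng)
        r2_sum += squared_end_to_end(path)
        rg2_sum += squared_radius_of_gyration(path)
    return r2_sum / n_walks, rg2_sum / n_walks


if __name__ == "__main__":
    n = 100
    r2, rg2 = averages(n)
    print("<r^2>   :", r2, "expected", n)
    print("<R_g^2> :", rg2, "expected", n * (n + 2) / (6 * (n + 1)))
```

Within statistical error the averages should approach $Na^2$ and $a^2N(N+2)/[6(N+1)]$, i.e., a ratio close to 6 for long chains.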



On a cubic lattice in $d$ dimensions, each step can go
in $2d$ directions, and for a general lattice, each vector $\mathbf{a}_i$
will have $\mu$ possible directions. The number of distinct
walks with $N$ steps is therefore $\mu^N$. Denoting by $\mathfrak{N}_N(\mathbf{r})$ the
number of distinct walks with end-to-end vector $\mathbf{r}$, the
probability density function for a given $\mathbf{r}$ is

$$p(\mathbf{r}) = \frac{\mathfrak{N}_N(\mathbf{r})}{\sum_{\mathbf{r}} \mathfrak{N}_N(\mathbf{r})}. \qquad (65)$$

For large $N$, due to the independence of individual $\mathbf{a}_i$,
this probability density function will acquire a Gaussian
shape,

$$p(\mathbf{r}) = \left( \frac{d}{2\pi N a^2} \right)^{d/2} \exp\left( -\frac{d\, \mathbf{r}^2}{2 N a^2} \right), \qquad (66)$$

where the normalisation is such that $\langle \mathbf{r}^2 \rangle = N a^2$. From
this expression, we can deduce that the number of degrees
of freedom of a closed random walk chain is proportional
to $N^{-d/2}$, the entropy loss suffered by a chain subject to
the constraint $\mathbf{r} = 0$. On a general lattice,

$$\omega \simeq \mu^N N^{-d/2} \qquad (67)$$

with the connectivity constant $\mu$, a measure for in how
many different directions the next bond vector can point
($\mu = 2d$ in a cubic lattice). At fixed end-to-end distance,
the entropy of the random walk becomes $S(\mathbf{r}) = S_0 -
d\,\mathbf{r}^2/(2Na^2)$, where $S_0$ absorbs all constants. For the free
energy $\mathscr{F}(\mathbf{r}) = E - k_BT S(\mathbf{r})$ we therefore obtain

$$\mathscr{F}(\mathbf{r}) = \mathscr{F}_0 + \frac{d\, k_BT\, \mathbf{r}^2}{2 R_0^2}, \qquad (68)$$

i.e., the random walk likes to coil, the restoring force
$-\nabla \mathscr{F}(\mathbf{r})$ being linear in $\mathbf{r}$. This is often called the entropic
spring character of a Gaussian polymer. Note that
the 'spring constant' increases with temperature ('entropy
elasticity').

In this random walk model of a polymer chain, it is
straightforward to define the persistence length of the
chain. By this we mean that successive vectors $\mathbf{a}_i$ are
not independent, but tend to be parallel. Over long
distance, this correlation is lost, and the chain behaves
like a random walk. Due to the quantum chemistry of
the monomers, an adjacent pair of vectors $\mathbf{a}_i, \mathbf{a}_{i+1}$ includes
preferred angles, for carbon chains leading to the
trans/gauche configurations. This feature is captured
schematically in the freely rotating chain as depicted in
figure 44. Following (191), we can obtain the correlation
$\langle \mathbf{a}_n \cdot \mathbf{a}_m \rangle$ as follows. If we fix all vectors $\mathbf{a}_m, \ldots, \mathbf{a}_{n-1}$,
then the average $\langle \mathbf{a}_n \rangle_{\mathbf{a}_m, \ldots, \mathbf{a}_{n-1}\ {\rm fixed}} = \mathbf{a}_{n-1} \cos\theta$.
Multiplication by $\mathbf{a}_m$ produces

$$\langle \mathbf{a}_m \cdot \mathbf{a}_n \rangle_{\mathbf{a}_m, \ldots, \mathbf{a}_{n-1}\ {\rm fixed}} = \mathbf{a}_m \cdot \mathbf{a}_{n-1} \cos\theta. \qquad (69)$$

Averaging over the $\mathbf{a}_m, \ldots, \mathbf{a}_{n-1}$ leads to the recursion
relation $\langle \mathbf{a}_m \cdot \mathbf{a}_n \rangle = \langle \mathbf{a}_m \cdot \mathbf{a}_{n-1} \rangle \cos\theta$. With the initial
condition $\langle \mathbf{a}_m^2 \rangle = a^2$, we find

$$\langle \mathbf{a}_m \cdot \mathbf{a}_n \rangle = a^2 \cos^{|n-m|}\theta. \qquad (70)$$

Thus, if $\theta = 0$, we obtain a rigid rod behaviour, while
for $\theta \neq 0$, there occurs an exponential decay of the correlation
between any two bond vectors $\mathbf{a}_n$ and $\mathbf{a}_m$. This
defines a length scale

$$\ell_p = -\frac{a}{\log \cos\theta}, \qquad (71)$$

the 'persistence length' of the chain. It diverges for
$\theta \to 0$, while for $\theta = 90°$, it vanishes, corresponding to
the random walk model discussed above ('freely jointed
chain'). As

$$\sum_{k=-\infty}^{\infty} \langle \mathbf{a}_n \cdot \mathbf{a}_{n+k} \rangle = a^2 \left( 1 + 2 \sum_{k=1}^{\infty} \cos^k\theta \right) = a^2\, \frac{1 + \cos\theta}{1 - \cos\theta}, \qquad (72)$$

we find $R_0^2 = a^2 N (1 + \cos\theta)/(1 - \cos\theta)$, i.e., statistically,
the freely rotating chain behaves the same as the random
walk chain, but with a rescaled monomer length. The
statistical unit in a polymer chain is often taken to be
the Kuhn length $\ell_K = 2\ell_p$.

FIG. 44 Freely rotating chain, in which successive bond vectors
include an angle $\theta$.
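The relations for the chain with fixed bond angle can be verified numerically. The sketch below assumes the persistence length in the form $\ell_p = -a/\log\cos\theta$ and compares the closed expression $(1+\cos\theta)/(1-\cos\theta)$ for the rescaling factor of $R_0^2$ with a direct summation of the bond correlations; the truncation `k_max` of the sum is an arbitrary numerical choice.

```python
import math


def bond_correlation(k: int, theta: float, a: float = 1.0) -> float:
    """Correlation <a_n . a_(n+k)> = a^2 cos^|k|(theta) for fixed bond angle theta."""
    return a * a * math.cos(theta) ** abs(k)


def persistence_length(theta: float, a: float = 1.0) -> float:
    """Decay length of the bond correlations, l_p = -a / log(cos(theta))."""
    c = math.cos(theta)
    if not 0.0 < c < 1.0:
        raise ValueError("requires 0 < theta < 90 degrees")
    return -a / math.log(c)


def rescaling_factor(theta: float) -> float:
    """Closed form (1 + cos theta) / (1 - cos theta) multiplying N a^2."""
    c = math.cos(theta)
    return (1.0 + c) / (1.0 - c)


def rescaling_factor_by_sum(theta: float, k_max: int = 10_000) -> float:
    """Same factor from the truncated geometric sum 1 + 2 * sum_k cos^k(theta)."""
    c = math.cos(theta)
    return 1.0 + 2.0 * sum(c ** k for k in range(1, k_max + 1))


if __name__ == "__main__":
    theta = math.radians(60.0)
    print(persistence_length(theta), rescaling_factor(theta),
          rescaling_factor_by_sum(theta))
```

For $\theta = 60°$ both routes give a factor of 3, and the bond correlation decays as $\exp(-ka/\ell_p)$ with $\ell_p = a/\log 2$.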

Above chain models are often referred to as being 
phantom, i.e., the chain can freely cross itself. A physical 
polymer possesses an excluded volume and behaves like 
a so-called self-avoiding chain. Mathematically, this can 
be modelled by self-avoiding walks. To include the ma- 
jor effects, it is suflacient to follow a simple argument due 
to Flory. Consider a chain with unknown radius R and 
internal monomer concentration Cjnt — N/R'^. Assum- 
ing that the self-avoiding character is due to monomer- 
monomer interactions, the repulsive energy is propor- 
tional to the squared concentration, i.e.. 



1 



(73) 



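with the excluded volume parameter $v(T)$ ($v(T) = (1 - 2\chi)a^d$ in Flory's notation, where the $\Theta$ condition $\chi = 1/2$ corresponds to ideal chain behaviour). To obtain the total averaged repulsive energy $\mathscr{F}_{\mathrm{rep|tot}}$, we need to average over $c^2$. In a mean field approach, we take $\langle c^2 \rangle \to \langle c \rangle^2 \sim c_{\mathrm{int}}^2$. We therefore obtain

$$\mathscr{F}_{\mathrm{rep|tot}} \simeq T v(T) c_{\mathrm{int}}^2 R^d = T v(T) \frac{N^2}{R^d}, \qquad (74)$$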

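favouring large values of $R$. This 'swelling' competes with the entropic elasticity contribution $\mathscr{F}_{\mathrm{el}} \simeq T R^2/(N a^2)$.

The total free energy becomes

$$\frac{\mathscr{F}}{T} \simeq v(T) \frac{N^2}{R^d} + \frac{R^2}{N a^2}, \qquad (75)$$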



with a minimum at $R^{d+2} \simeq v(T) a^2 N^3$, so that the Flory radius scales like

$$R_F \simeq A N^{\nu}, \quad \text{therefore} \quad \nu = \frac{3}{2 + d}. \qquad (76)$$



The values of the exponent $\nu(d = 2) = 3/4$ and $\nu(d = 3) = 3/5$ are extremely close to the best known values 0.75 and 0.588.^39
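
The Flory argument can be checked numerically. The sketch below minimises a free energy of the form $\mathscr{F}/T = v N^2/R^d + R^2/(N a^2)$ for two chain lengths and extracts the exponent $\nu$ from the ratio of the resulting radii. The function names, the default values $v = a = 1$ and the two chain lengths are assumptions made for illustration only.

```python
import math


def flory_free_energy(R, N, d, v=1.0, a=1.0):
    """F/T: excluded volume repulsion plus entropic elasticity."""
    return v * N**2 / R**d + R**2 / (N * a**2)


def flory_radius(N, d, v=1.0, a=1.0, iterations=200):
    """Locate the minimum of F/T by ternary search in x = log R.

    In the variable x the free energy is a sum of two exponentials and
    therefore convex, so the search converges to the unique minimum.
    """
    lo, hi = math.log(1.0e-3 * a), math.log(10.0 * N * a)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        f1 = flory_free_energy(math.exp(m1), N, d, v, a)
        f2 = flory_free_energy(math.exp(m2), N, d, v, a)
        if f1 < f2:
            hi = m2
        else:
            lo = m1
    return math.exp(0.5 * (lo + hi))


def flory_exponent(d, N1=1.0e3, N2=1.0e6):
    """Effective exponent nu from the radii at two chain lengths."""
    r1, r2 = flory_radius(N1, d), flory_radius(N2, d)
    return math.log(r2 / r1) / math.log(N2 / N1)


if __name__ == "__main__":
    for d in (1, 2, 3, 4):
        print(d, flory_exponent(d), 3.0 / (2.0 + d))
```

Because the free energy is an exact power law in $N$ at the minimum, the numerical exponent agrees with $3/(2 + d)$ up to the precision of the search. The numerical prefactor of the radius, which the scaling argument drops, does not affect the exponent.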



Polymer networks. 



A linear excluded volume polymer chain has the size

$$R^2 \simeq A N^{2\nu} \qquad (77)$$



with $\nu = 0.75$ in $d = 2$, and $\nu = 0.588$ in $d = 3$. Its number of degrees of freedom is given in terms of the configuration exponent $\gamma$ such that

$$\omega \simeq \mu^N N^{\gamma - 1}, \qquad (78)$$

where $\gamma = 43/32 \approx 1.34$ in $d = 2$ and $\gamma \approx 1.16$ in $d = 3$.
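
The exponential growth of the number of configurations with the connectivity constant $\mu$ can be illustrated by exhaustive enumeration of short self-avoiding walks. The following sketch counts walks on the square lattice by depth-first search. The choice of lattice and the function names are assumptions made for this example; $\mu$ is lattice dependent, whereas $\gamma$ is universal.

```python
def count_saws(n_steps):
    """Count self-avoiding walks of n_steps steps on the square lattice.

    All walks start at the origin; walks related by symmetry are
    counted separately.
    """
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in steps:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total

    return extend((0, 0), {(0, 0)}, n_steps)


def growth_ratios(n_max):
    """Successive ratios c_N / c_{N-1}, crude estimates of mu."""
    counts = [count_saws(n) for n in range(1, n_max + 1)]
    return [counts[i] / counts[i - 1] for i in range(1, len(counts))]


if __name__ == "__main__":
    print([count_saws(n) for n in range(1, 9)])
    print(growth_ratios(10))
```

The first counts are 4, 12, 36, 100, 284 and 780. The successive ratios decrease slowly towards the square lattice value $\mu \approx 2.638$; the slow convergence reflects the power law correction $N^{\gamma - 1}$.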

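Remarkably, similar critical exponents can be obtained for a general polymer network of the type shown in figure 45, as originally derived by Duplantier (147, 154), compare also references (157, 385): In a network $\mathcal{G}$ consisting of $\mathcal{N}$ chain segments of lengths $s_1, \ldots, s_{\mathcal{N}}$ and total length $L = \sum_{i=1}^{\mathcal{N}} s_i$, the number of configurations $\omega_{\mathcal{G}}$ scales as

$$\omega_{\mathcal{G}}(s_1, \ldots, s_{\mathcal{N}}) \simeq \mu^L s_{\mathcal{N}}^{\gamma_{\mathcal{G}} - 1}\, Y_{\mathcal{G}}\left(\frac{s_1}{s_{\mathcal{N}}}, \ldots, \frac{s_{\mathcal{N}-1}}{s_{\mathcal{N}}}\right), \qquad (79)$$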



where $Y_{\mathcal{G}}$ is a scaling function, and $\mu$ is the effective connectivity constant for self-avoiding walks. The exponent $\gamma_{\mathcal{G}}$ is given by $\gamma_{\mathcal{G}} = 1 - d\nu\mathcal{L} + \sum_{N \ge 1} n_N \sigma_N$, where $\nu$ is the swelling exponent, $\mathcal{L}$ is the number of independent loops, $n_N$ is the number of vertices with $N$ outgoing legs, and $\sigma_N$ is an exponent associated with such a vertex. In $d = 2$, this exponent is given by (147, 154)

$$\sigma_N = \frac{(2 - N)(9N + 2)}{64}. \qquad (80)$$
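
To make the topological bookkeeping concrete, the following sketch evaluates the network exponent $\gamma_{\mathcal{G}} = 1 - d\nu\mathcal{L} + \sum_N n_N \sigma_N$ for dilute polymers in $d = 2$, using $\nu = 3/4$ and $\sigma_N = (2 - N)(9N + 2)/64$. The function names and the dictionary representation of the vertex counts are assumptions made for this example.

```python
from fractions import Fraction


def sigma_dilute_2d(N):
    """Vertex exponent sigma_N = (2 - N)(9N + 2)/64 for an N-leg vertex."""
    return Fraction((2 - N) * (9 * N + 2), 64)


def gamma_dilute_2d(vertex_counts, loops):
    """Network exponent for dilute self-avoiding polymers in d = 2.

    vertex_counts maps the vertex order N to the number n_N of such
    vertices; loops is the number of independent loops.
    """
    d_nu = 2 * Fraction(3, 4)  # d * nu with d = 2 and nu = 3/4
    vertex_sum = sum(n * sigma_dilute_2d(N) for N, n in vertex_counts.items())
    return 1 - d_nu * loops + vertex_sum


if __name__ == "__main__":
    print(gamma_dilute_2d({1: 2}, 0))        # linear chain: two end points
    print(gamma_dilute_2d({1: 3, 3: 1}, 0))  # three-arm star
    print(gamma_dilute_2d({}, 1))            # simple ring
```

A linear chain has two vertices of order 1 and no loop, which gives $\gamma = 43/32$, the exact 2D value of the configuration exponent of a single chain. The three-arm star gives $17/16$, and the simple ring gives $1 - d\nu = -1/2$. Vertices of order 2 do not contribute, since $\sigma_2 = 0$.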



In the dense phase in 2D (209, 210, 212, 213, 21?, 386) and at the $\Theta$ transition (138), analogous results can be obtained.



^39 An interesting discussion about the flaws underlying this reasoning can be found in reference (146).
FIG. 45 Polymer network $\mathcal{G}$ with vertices (•) of different order $N$, where $N$ self-avoiding walks are joined ($n_1 = 5$, $n_3 = 4$, $n_4 = 3$, $n_5 = 1$).



First, consider the dense phase in 2D. If all segments have equal length $s$ and $L = \mathcal{N}s$, the configuration number $\omega_{\mathcal{G}}$ of such a network scales as (209, 210)

$$\omega_{\mathcal{G}}(s) \sim \omega_0(L)\, s^{\gamma_{\mathcal{G}}}, \qquad (81)$$

where $\omega_0(L)$ is the configuration number of a simple ring of length $L$. For dense polymers, and in contrast to the dilute phase or at the $\Theta$ point, $\omega_0(L)$ (and thus $\omega_{\mathcal{G}}$) depends on the boundary conditions and even on the shape of the system (210, 212, 213, 21?, 386). For example, for periodic boundary conditions (which we focus on in this study) corresponding to a 2D torus, one finds $\omega_0(L) \sim \mu^L L^{\Psi - 1}$ with a connectivity constant $\mu$ and $\Psi = 1$ (210). However, the network exponent

$$\gamma_{\mathcal{G}} = 1 - \mathcal{L} + \sum_{N \ge 1} n_N \sigma_N \qquad (82)$$

is universal and depends only on the topology of the network by the number $\mathcal{L}$ of independent loops, and by the number $n_N$ of vertices of order $N$ with vertex exponents $\sigma_N = (4 - N^2)/32$ (209, 210). For a linear chain, the corresponding exponent $\gamma_{\mathrm{lin}} = 19/16$ has been verified by numerical simulations (210, 38?). For a network made up of different segment lengths $\{s_i\}$ of total length $L = \sum_{i=1}^{\mathcal{N}} s_i$, equation (81) generalises to (cf. section 4 in reference (210))

$$\omega_{\mathcal{G}}(s_1, \ldots, s_{\mathcal{N}}) \sim \omega_0(L)\, s_{\mathcal{N}}^{\gamma_{\mathcal{G}}}\, Y_{\mathcal{G}}\left(\frac{s_1}{s_{\mathcal{N}}}, \ldots, \frac{s_{\mathcal{N}-1}}{s_{\mathcal{N}}}\right), \qquad (83)$$

which involves the scaling function $Y_{\mathcal{G}}$.

For polymers in an infinite volume and endowed with an attractive interaction between neighbouring monomers, a different scaling behaviour emerges if the system is not below but right at the $\Theta$ point (210). In this case the number of configurations of a general network $\mathcal{G}$ is given by

$$\bar{\omega}_{\mathcal{G}}(s_1, \ldots, s_{\mathcal{N}}) \sim \bar{\mu}^L s_{\mathcal{N}}^{\bar{\gamma}_{\mathcal{G}} - 1}\, \bar{Y}_{\mathcal{G}}\left(\frac{s_1}{s_{\mathcal{N}}}, \ldots, \frac{s_{\mathcal{N}-1}}{s_{\mathcal{N}}}\right), \qquad (84)$$

with the network exponent

$$\bar{\gamma}_{\mathcal{G}} = 1 - d\bar{\nu}\mathcal{L} + \sum_{N \ge 1} n_N \bar{\sigma}_N. \qquad (85)$$

Overlined symbols refer to polymers at the $\Theta$ point. In $d = 2$, $\bar{\nu} = 4/7$ and $\bar{\sigma}_N = (2 - N)(2N + 1)/42$ (210).

Note that due to the factor $\omega_0(L)$ the exponent of $s$ is $\gamma_{\mathcal{G}}$, and not $\gamma_{\mathcal{G}} - 1$ like in the expressions used in the dilute phase (154) or at the $\Theta$ point, for which $\omega_0(L) \sim \mu^L L^{-d\nu}$. However, for 2D dense polymers one has $d\nu = 1$, so that both definitions of $\gamma_{\mathcal{G}}$ are equivalent, cf. section 3 in reference (210).
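
The network exponents of the dense phase and of the $\Theta$ point can be evaluated in the same way as in the dilute case. The sketch below assumes the 2D expressions $\gamma_{\mathcal{G}} = 1 - \mathcal{L} + \sum_N n_N \sigma_N$ with $\sigma_N = (4 - N^2)/32$ for dense polymers, and $\bar{\gamma}_{\mathcal{G}} = 1 - d\bar{\nu}\mathcal{L} + \sum_N n_N \bar{\sigma}_N$ with $\bar{\nu} = 4/7$ and $\bar{\sigma}_N = (2 - N)(2N + 1)/42$ at the $\Theta$ point. The function names and the dictionary representation of the vertex counts are hypothetical.

```python
from fractions import Fraction


def sigma_dense_2d(N):
    """Dense phase vertex exponent (4 - N^2)/32."""
    return Fraction(4 - N * N, 32)


def sigma_theta_2d(N):
    """Theta point vertex exponent (2 - N)(2N + 1)/42."""
    return Fraction((2 - N) * (2 * N + 1), 42)


def gamma_dense_2d(vertex_counts, loops):
    """Network exponent for dense polymers in 2D, where d * nu = 1."""
    vertex_sum = sum(n * sigma_dense_2d(N) for N, n in vertex_counts.items())
    return 1 - loops + vertex_sum


def gamma_theta_2d(vertex_counts, loops):
    """Network exponent at the Theta point in 2D, with nu = 4/7."""
    d_nu = 2 * Fraction(4, 7)
    vertex_sum = sum(n * sigma_theta_2d(N) for N, n in vertex_counts.items())
    return 1 - d_nu * loops + vertex_sum


if __name__ == "__main__":
    linear_chain = {1: 2}  # two end points of order 1, no loop
    print(gamma_dense_2d(linear_chain, 0))
    print(gamma_theta_2d(linear_chain, 0))
```

For the linear chain the dense phase expression returns $19/16$, the value quoted in the text, and the $\Theta$ point expression returns $8/7$.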